AD  7  S  44  3  4 


PURDUE  UNIVERSITY 


DEPARTMENT  OF  STATISTICS 

Reproduced  by 

NATIONAL  TECHNICAL 
INFORMATION  SERVICE 


DIVISION  OF  MATHEMATICAL  SCIENCES 


DOCUMENT CONTROL DATA - R&D

Security classification:       Unclassified
Originating activity:          Purdue University
Report title:                  On Multiple Decision (Subset Selection) Procedures
Descriptive notes:             Technical Report, December 1971
Authors:                       Gupta, Shanti S. and Panchapakesan, S.
Report date:                   December 1971
Total no. of pages:            90
No. of refs:                   73
Contract or grant no.:         N00014-67-A-0226-00014
Originator's report number:    Mimeo Series #273
Availability/limitation:       Distribution of this document is unlimited.
Sponsoring military activity:  Office of Naval Research, Washington, D. C.

Abstract

This report is a survey of developments and significant results in the area
of multiple decision procedures under the subset selection formulation. Section
2 deals with procedures for location and scale parameters. A general theory of
the subset selection problem and a decision-theoretic formulation are discussed
in Section 3. Sections 4 through 7 deal with parametric and non-parametric
procedures for discrete populations, multinomial cells and multivariate normal
populations using single-stage sampling, inverse sampling and sequential sampling.
Section 8 describes procedures applicable to restricted families of distributions
such as the increasing failure rate (IFR) and increasing failure rate on the
average (IFRA) distributions. Bayes and empirical Bayes procedures are also
discussed. The last section summarizes briefly several modifications of the
basic problem and goal.
On Multiple Decision (Subset Selection) Procedures


Shanti  S.  Gupta 
Purdue  University 

and 

S.  Panchapakesan 
Southern  Illinois  University 

1. Introduction

In many experimental situations the experimenter is confronted
with the problem of making decisions regarding k populations, which, for
example, may be categories of wheat, manufactured items coming out of k
factories or candidates who are contenders for an award. The classical
tests of homogeneity which have been applied in these situations do not
supply the information the experimenter really seeks, whether or not the
tests yield significant results. In fact, the experimenter's problems
begin when he obtains a significant result which rejects the null
hypothesis that the populations are identical. As a partial answer to the
need for a more realistic formulation overcoming the inadequacy of the tests
of homogeneity, Mosteller (1948) tested homogeneity against slippage
alternatives. Since then many authors have contributed to the theory of slippage
tests.

The initial efforts in the direction of multiple decision problems were
made by Paulson (1949), who considered the problem of classifying the given
populations into a "superior" and an "inferior" group. Later he (1952)
investigated the problem of selecting the "best" of k categories when
comparing (k-1) experimental categories with a standard or control. Bahadur (1950)
has made some early contributions to the theory of k sample problems.
Bahadur and Robbins (1950) obtained some minimax rules for selecting from
two populations the one with the greater mean. The multiple decision
problems that are now known as the ranking and selection problems have been
formulated mainly in two ways. The first one is known as the indifference
zone formulation due to Bechhofer (1954). This formulation, in its simplest
form, selects one of the populations as the best with a guarantee that the
true best population is selected with at least a preassigned probability P*
whenever the best and the second best populations are "sufficiently" far
apart. For an exposition of this formulation the reader is referred to the
excellent monograph by Bechhofer, Kiefer and Sobel (1968). The main
investigations surveyed in the present paper are under the second formulation due
to Gupta (1956) known as the subset selection formulation. The goal here is
to select a non-empty subset of the given populations so that the selected
subset includes the best population with at least a preassigned probability
P*. It is usually desired that this be accomplished by selecting a subset
as  small  as  possible  and  without  any  knowledge  of  the  true  values  of  the 
usually defined as the best population. In the case of a tie, we assume
that one of the populations with $\lambda_i = \lambda_{[k]}$ (or $\lambda_i = \lambda_{[1]}$) is tagged as the
best. The selection of any subset which includes the best population is called
a correct selection (CS) and $P\{CS \mid R\}$ denotes the probability of a correct
selection using the rule R. Thus we are interested in defining a rule R
such that

(1.1)  $P\{CS \mid R\} \ge P^*$,  $k^{-1} < P^* < 1$,

regardless of the true parameter point $\lambda = (\lambda_1, \ldots, \lambda_k)$ in the parameter space
$\Omega = \{\lambda\}$. If the distributions are not indexed by the values of any parameter
$\lambda$, $\Omega$ denotes the space of the k-tuples $(F_1, \ldots, F_k)$ where $F_i$ is the distribution
function of $\pi_i$. In order that (1.1) be met, we want

(1.2)  $\inf_{\Omega} P\{CS \mid R\} \ge P^*$.

The requirement (1.2) is usually referred to as the basic probability
requirement or the P*-condition.

2.  Selection  in  terms  of  Location  and  Scale  Parameters. 

Many of the early investigations relate to ranking and selection of
populations in terms of either location or scale parameters. The ranking
of normal means and of gamma scale parameters are examples of this type.
Let us first suppose that $\pi_i$ ($i = 1, \ldots, k$) has the continuous distribution
$F_{\lambda_i}(x) = F(x - \lambda_i)$, $-\infty < \lambda_i < \infty$, and $x_i$ is an observation from $\pi_i$. In order to
select a subset containing the population associated with $\lambda_{[k]}$, we define the
following rule $R_1$.


(2.1)  $R_1$: Select $\pi_i$ iff $x_i \ge x_{\max} - d$

where $x_{\max} = \max(x_1, \ldots, x_k)$ and d is a positive constant chosen so as to
satisfy the basic probability requirement. It is easy to see that

(2.2)  $P\{CS \mid R_1\} = \int_{-\infty}^{\infty} \prod_{j=1}^{k-1} F(y + d + \lambda_{[k]} - \lambda_{[j]}) \, dF(y)$.

Clearly, the infimum of $P\{CS \mid R_1\}$ is attained when $\lambda_1 = \cdots = \lambda_k$ and hence d is given by

(2.3)  $\int_{-\infty}^{\infty} F^{k-1}(y + d) \, dF(y) = P^*$.

Denoting by S the number of populations included in the selected subset,
we can see that

(2.4)  $E(S) = p_1 + \cdots + p_k$,

where $p_i$ is the probability that the population associated with $\lambda_{[i]}$ is
included in the subset. In the present case

(2.5)  $p_i = \int_{-\infty}^{\infty} \prod_{j \ne i} F(y + d + \lambda_{[i]} - \lambda_{[j]}) \, dF(y)$.

It has been shown by Gupta (1965) that $\sup_{\Omega} E(S)$ is attained when $\lambda_1 = \cdots = \lambda_k$
provided that the density $f_\lambda(x) = f(x - \lambda)$ has a monotone likelihood ratio
in x, and in that case the supremum is kP*. The procedure $R_1$ has also been
shown to be monotone in the sense that $p_i \ge p_j$ for $\lambda_{[i]} > \lambda_{[j]}$.
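
As an informal check of the statements above, the following sketch applies rule $R_1$ to simulated data and estimates $P\{CS \mid R_1\}$ and $E(S)$. It is not part of the original report: the function names, the choice of unit-variance normal populations and the simulation sizes are assumptions made only for illustration. For k = 2 standard normal populations with equal means, (2.3) reduces to $\Phi(d/\sqrt{2}) = P^*$, so $d \approx 2.3262$ corresponds to P* = 0.95 and the expected subset size should be close to kP* = 1.9.

```python
import random

def select_r1(xs, d):
    """Rule R_1 of (2.1): keep population i iff x_i >= max(xs) - d."""
    x_max = max(xs)
    return [i for i, x in enumerate(xs) if x >= x_max - d]

def simulate_r1(means, d, trials=20000, seed=0):
    """Monte Carlo estimates of P{CS | R_1} and E(S) for N(mean_i, 1) data.

    Illustrative assumption: one observation per population, unit variance.
    In case of a tie in the means, the last tied population is tagged as best.
    """
    rng = random.Random(seed)
    best = max(range(len(means)), key=lambda i: (means[i], i))
    hits = 0
    total_size = 0
    for _ in range(trials):
        xs = [rng.gauss(m, 1.0) for m in means]
        chosen = select_r1(xs, d)
        hits += best in chosen
        total_size += len(chosen)
    return hits / trials, total_size / trials

if __name__ == "__main__":
    pcs, es = simulate_r1([0.0, 0.0], d=2.3262)
    print(f"estimated P(CS) = {pcs:.3f}, estimated E(S) = {es:.3f}")
```

Because the largest observation always satisfies the inequality in (2.1), the selected subset is never empty.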

As an application of the above results, we consider selecting a subset
containing the population with the largest mean from k independent normal
populations with unknown means and a common known variance $\sigma^2$.
If $\bar{y}_i$ ($i = 1, \ldots, k$) is the sample mean based on n observations from $\pi_i$, the
rule $R_1$ in this case selects $\pi_i$ iff $\bar{y}_i \ge \max_{1 \le j \le k} \bar{y}_j - d_1$, where $d_1$ will depend on
n and k. By letting $d_1 = d\sigma/\sqrt{n}$, the constant d is given by

(2.6)  $\int_{-\infty}^{\infty} \Phi^{k-1}(u + d) \, \phi(u) \, du = P^*$,

where, unless otherwise stated, $\Phi$ and $\phi$ denote here and in the sequel the
cdf and the density of the standard normal distribution. If $\sigma^2$ is unknown,
one will naturally use $s^2$, the pooled estimate of $\sigma^2$ based on $\nu = k(n-1)$ degrees
of freedom. In this case we can show that d is given by

(2.7)  $\int_{0}^{\infty} \int_{-\infty}^{\infty} \Phi^{k-1}(u + yd) \, \phi(u) \, g_\nu(y) \, du \, dy = P^*$,

where $g_\nu(y)$ is the density of $\chi_\nu/\sqrt{\nu}$, $\nu = k(n-1)$.
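
The constant d in (2.6) has no closed form for general k, but it can be found numerically. The sketch below is an illustration rather than the authors' computation: it evaluates the left side of (2.6) by composite Simpson's rule and solves for d by bisection. The function names, the integration range and the step count are assumptions chosen for this example. For k = 2 the integral equals $\Phi(d/\sqrt{2})$, which gives an independent check.

```python
import math

def phi(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def Phi(u):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def pcs_equal_means(d, k, lo=-8.0, hi=8.0, steps=2000):
    """Left side of (2.6), integrated by composite Simpson's rule (steps even)."""
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps + 1):
        u = lo + j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * Phi(u + d) ** (k - 1) * phi(u)
    return total * h / 3.0

def solve_d(k, p_star, tol=1e-8):
    """Bisection for d in (2.6); the left side increases from 1/k towards 1."""
    lo, hi = 0.0, 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pcs_equal_means(mid, k) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for k in (2, 3, 5):
        print(k, round(solve_d(k, 0.95), 4))
```

The selection rule then uses $d_1 = d\sigma/\sqrt{n}$ as described in the text.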

Rizvi (1963) considered the goal of selecting a non-empty subset from k
normal populations so as to include the one with the largest $|\mu_i|$. He
uses a rule of the type $R_1$ based on $|\bar{x}_i|$. For his procedure

(2.8)  $\sup_{\Omega} E(S) = 2k \int_{0}^{\infty} [2\Phi(u + d) - 1]^{k-1} \, d\Phi(u)$,

where d is given by (2.6). This bound for E(S), however, exceeds kP*.

Suppose the populations $\pi_i$, $i = 1, \ldots, k$, have the continuous
distributions $F_{\lambda_i}(x) = F(x/\lambda_i)$, $x \ge 0$, $\lambda_i > 0$. To select a subset containing the
population associated with $\lambda_{[k]}$ we define the procedure $R_2$ as follows:

(2.9)  $R_2$: Select $\pi_i$ iff $x_i \ge c^{-1} x_{\max}$

where $x_i$ is an observation from $\pi_i$ and $c > 1$ is determined so that the basic
probability requirement is satisfied. It is easily seen that
$\inf_{\Omega} P\{CS \mid R_2\}$ is attained when $\lambda_1 = \cdots = \lambda_k$ and the constant c is given by

(2.10)  $\int_{0}^{\infty} F^{k-1}(cy) \, dF(y) = P^*$.

The rule $R_2$ is monotone and if the density $f_\lambda(x) = \lambda^{-1} f(x/\lambda)$ has a monotone
likelihood ratio in x, then $\sup_{\Omega} E(S)$ is attained when $\lambda_1 = \cdots = \lambda_k$ and is
equal to kP*.
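
To make (2.10) concrete, the sketch below treats one simple case that is an assumption of this example, not a case worked in the text: a single observation from each of k exponential populations, $F(y) = 1 - e^{-y}$. Expanding $(1 - e^{-cy})^{k-1}$ by the binomial theorem turns the left side of (2.10) into the finite sum $\sum_{j=0}^{k-1} (-1)^j \binom{k-1}{j} / (1 + jc)$, which for k = 2 reduces to $c/(1+c)$. The function names are hypothetical.

```python
from math import comb

def pcs_equal_scales(c, k):
    """Left side of (2.10) for F(y) = 1 - exp(-y), via binomial expansion."""
    return sum((-1) ** j * comb(k - 1, j) / (1.0 + j * c) for j in range(k))

def solve_c(k, p_star, tol=1e-10):
    """Bisection for c > 1 in (2.10); the left side increases with c."""
    lo, hi = 1.0, 2.0
    while pcs_equal_scales(hi, k) < p_star:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pcs_equal_scales(mid, k) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def select_r2(xs, c):
    """Rule R_2 of (2.9): keep population i iff x_i >= max(xs) / c."""
    x_max = max(xs)
    return [i for i, x in enumerate(xs) if x >= x_max / c]

if __name__ == "__main__":
    print(solve_c(2, 0.9))
    print(select_r2([1.0, 2.0, 4.0], 2.0))
```

The alternating sum is adequate for small k; for large k a direct numerical integration would be the safer choice.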

A specific example of interest is the selection from k gamma populations
with densities

(2.11)  $f_{\lambda_i}(x) = \dfrac{e^{-x/\lambda_i} x^{r-1}}{\Gamma(r) \lambda_i^{r}}$,  $x > 0$, $\lambda_i > 0$, $i = 1, \ldots, k$.

In order to select a subset containing the population with the largest $\lambda_i$, we use the
rule based on the sample means $\bar{x}_i$ of n observations from each population, namely,

(2.12)  $R_2$: Select $\pi_i$ iff $\bar{x}_i \ge b^{-1} \max_{1 \le j \le k} \bar{x}_j$,

where $b > 1$ is determined so as to satisfy the basic probability requirement.
This procedure has been studied by Gupta (1963). The analogous problem of
selecting the gamma population with the smallest $\lambda_i$ has been discussed by
Gupta and Sobel (1962a, 1962b). This problem arises in the context of
selecting a subset containing the normal population with the smallest variance,
and the rule is an obvious modification of $R_2$ based on the estimates
$s_i^2$ ($i = 1, \ldots, k$) of the population variances $\sigma_i^2$ using equal sample sizes.


For the problem of ranking and selection from normal populations in
terms of their means, Seal (1955) considered a class of procedures
satisfying the basic probability requirement. Assuming that the populations have
a common unknown variance, let $\bar{x}_1, \ldots, \bar{x}_k$ be the sample means from the
populations, each based on n independent observations. Let $c = (c_1, \ldots, c_{k-1})$ be a
vector whose components are arbitrary non-negative numbers such that
$c_1 + \cdots + c_{k-1} = 1$. Let $\bar{x}_{[1]} \le \cdots \le \bar{x}_{[k]}$ be the ordered sample means. The
class C of rules $D_c$ defined by Seal is as follows:

$D_c$: Include in the selected subset the population corresponding to $\bar{x}_{[i]}$ iff

(2.13)  $\bar{x}_{[i]} \ge c_1 \bar{x}_{[1]} + \cdots + c_{i-1} \bar{x}_{[i-1]} + c_i \bar{x}_{[i+1]} + \cdots + c_{k-1} \bar{x}_{[k]} - t(P^*, c) \, s/\sqrt{n}$,

where $s^2$ is the usual pooled estimate of the common variance $\sigma^2$, and $t(P^*, c)$
satisfying the P*-condition is given by the upper 100(1-P*) percent point of
the distribution of $Y = \left( \sum_{i=1}^{k-1} c_i z_{(i)} - z_k \right) / s$, where $z_i$, $i = 1, \ldots, k$, are random
observations from $N(0, \sigma^2)$ and $z_{(1)} \le \cdots \le z_{(k-1)}$ are the ordered
$z_1, \ldots, z_{k-1}$.

The rules of this class possess certain desirable properties. For example,
the rule $D_c$ is unbiased, that is, P{rejecting any population not having the
largest mean} $\ge$ P{rejecting the population with the largest mean}. Also the
rule has the property of gradation, namely, corresponding to any P*, there
exists a constant $\mu_0$ (depending on the decision rule, the unknown means and
the common variance $\sigma^2$) such that P{retaining the population with mean $\mu_i$} $\lessgtr$ P*
according as $\mu_i \lessgtr \mu_0$.
If we now assume that $\sigma$ is known, we can take $\sigma = 1$ with no loss of
generality and the rule $D_c$ will be (2.13) with s = 1. We define a subclass
C' of C by the restriction $c_j = 1$ for some $j = 1, \ldots, k-1$. The procedure
R (called $R_1$ earlier in this section) studied by Gupta (1965) is a member
of C' with $c_{k-1} = 1$. It has been shown by Deely and Gupta (1968) that the
rule R has the smallest expected subset size among the rules of the class C'
provided that the parametric configuration is [illegible] and
$\delta$ is sufficiently large. If we consider a slippage configuration $(\mu, \ldots, \mu, \mu + \delta)$,
$\delta > 0$, Seal (1955) shows that in the class C, the rule D with $c_1 = \cdots = c_{k-1} = 1/(k-1)$
maximizes (approximately) the probability of including the population with mean
$\mu + \delta$. Deely and Gupta show that $E(S \mid R) < E(S \mid D)$ except when $\delta$ is near zero.
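
The relation between Seal's class and Gupta's rule can be seen in a few lines of code. The sketch below implements (2.13) for the known-variance case s = 1; it is an illustration only, and the function name and the threshold t passed in by the caller are assumptions (in practice t would be the percent point $t(P^*, c)$ described above). With c = (0, ..., 0, 1) the comparison is against the largest of the other means, which reproduces rule R with $d_1 = t/\sqrt{n}$; with equal weights the comparison is against the average of the other means.

```python
import math

def seal_select(means, c, t, n):
    """Rule D_c of (2.13) with s = 1 (known unit variance).

    means : the k sample means
    c     : k-1 non-negative weights summing to 1, applied to the
            remaining k-1 means in increasing order
    t     : threshold constant, supplied by the caller (assumed given)
    n     : common sample size
    """
    means = list(means)
    k = len(means)
    if len(c) != k - 1 or min(c) < 0 or abs(sum(c) - 1.0) > 1e-12:
        raise ValueError("c must hold k-1 non-negative weights summing to 1")
    chosen = []
    for i, x in enumerate(means):
        others = sorted(means[:i] + means[i + 1:])
        threshold = sum(w * y for w, y in zip(c, others)) - t / math.sqrt(n)
        if x >= threshold:
            chosen.append(i)
    return chosen

if __name__ == "__main__":
    data = [1.0, 1.2, 3.0]
    print(seal_select(data, (0.0, 1.0), t=1.0, n=1))   # Gupta's rule R
    print(seal_select(data, (0.5, 0.5), t=1.0, n=1))   # equal-weight rule D
```

In this small example the equal-weight rule retains one more population than R for the same t, which is only an illustration of how the two rules can differ, not a comparison at matched P*.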

Seal (1958) defined a class of rules similar to C for the problem of
selection from gamma populations given by (2.11). Let $c = (c_1, \ldots, c_{k-1})$ be
as before a vector of non-negative components such that $\sum_{i=1}^{k-1} c_i = 1$. Let
$x_1, \ldots, x_k$ be a set of observations from the k populations and $x_{[1]} \le x_{[2]} \le \cdots \le x_{[k]}$
be the ordered observations. Then, in order to select a subset containing the
population with the smallest $\lambda_i$, Seal proposed the class of rules defined below.

$D_c$: Include in the selected subset the population corresponding to $x_{[i]}$ iff

(2.14)  $x_{[i]} \le b \left( c_1 x_{[1]} + \cdots + c_{i-1} x_{[i-1]} + c_i x_{[i+1]} + \cdots + c_{k-1} x_{[k]} \right)$,

where b satisfying the basic probability requirement is given by the upper
100(1-P*) percent point of the distribution of $Y_k / \sum_{i=1}^{k-1} c_i Y_{(i)}$, where $Y_1, \ldots, Y_k$
are k random observations from a gamma population with $\lambda = 1$ and $Y_{(1)} \le \cdots \le Y_{(k-1)}$
are the ordered $Y_1, \ldots, Y_{k-1}$. Seal (1958) has obtained results similar to his
earlier ones for the class of rules $D_c$.


3. General Theory of Subset Selection.

In this section we will describe a class of subset selection rules
applicable to populations from a family of stochastically ordered distributions and
therefore in particular to populations characterized by a location or scale
parameter. Many of the specific selection problems discussed in the
subsequent sections fall under this general framework. We also discuss a
decision-theoretic formulation of the problem.

We assume that $\pi_1, \pi_2, \ldots, \pi_k$ have associated absolutely continuous
distributions $F_{\lambda_i}$ ($i = 1, \ldots, k$), where $\lambda_i \in \Lambda$, an interval on the real line.
The family $\{F_\lambda\}$, $\lambda \in \Lambda$, is assumed to be stochastically increasing (SI) in
$\lambda$, i.e., for $\lambda < \lambda'$ in $\Lambda$, $F_\lambda$ and $F_{\lambda'}$ are distinct and $F_\lambda(x) \ge F_{\lambda'}(x)$
for all x. For selecting a subset containing the population associated with $\lambda_{[k]}$,
Gupta and Panchapakesan (1970) have discussed a class of procedures $R_h$
defined by a class of real valued functions $h = h_{c,d}$, $c \ge 1$, $d \ge 0$, possessing
the following properties: For every x belonging to the support of $F_\lambda$, (i)
$h_{c,d}(x) \ge x$, (ii) $h_{1,0}(x) = x$, (iii) $h_{c,d}(x)$ is continuous in c and d, and
(iv) $\lim_{d \to \infty} h_{c,d}(x) = \infty$ (c fixed) and/or $\lim_{c \to \infty} h_{c,d}(x) = \infty$ (d fixed), $x \ne 0$. If
$x_1, \ldots, x_k$ are a set of observations from $\pi_1, \ldots, \pi_k$, respectively, the rule $R_h$
is defined as follows.

$R_h$: Include the population $\pi_i$ iff

(3.1)  $h(x_i) \ge \max_{1 \le r \le k} x_r$.

Letting $x_{(r)}$ denote the observation from the population with distribution
$F_{[r]} \equiv F_{\lambda_{[r]}}$, we obtain

(3.2)  $P\{CS \mid R_h\} = \int \prod_{r=1}^{k-1} F_{[r]}(h(x)) \, dF_{[k]}(x)$.
Because of the stochastic ordering of $\{F_\lambda\}$, we can see that

(3.3)  $\inf_{\Omega} P\{CS \mid R_h\} = \inf_{\lambda \in \Lambda} \psi(\lambda; c, d, k)$,

where $\psi(\lambda; c, d, t+1)$ is given by

(3.4)  $\psi(\lambda; c, d, t+1) = \int F_\lambda^{t}(h(x)) \, dF_\lambda(x)$.

In all the specific cases considered earlier in the literature, the general
approach is to show that $\psi(\lambda; c, d, k)$ is monotonic in $\lambda$ and use this fact
to evaluate $\inf_{\lambda} \psi(\lambda; c, d, k)$ and find the values of the constants such that
the P*-condition is met. One of the main results of Gupta and Panchapakesan
(1970) is the following theorem which leads to a sufficient condition for the
monotonicity of $\psi(\lambda; c, d, k)$.

Theorem 3.1. Let $\{F_\lambda\}$, $\lambda \in \Lambda$, be a family of absolutely continuous
distributions on the real line and $\psi(x, \lambda)$ be a real valued function possessing
continuous first partial derivatives $\psi_x$ and $\psi_\lambda$ w.r.t. x and $\lambda$, respectively.
Then, $E_\lambda \psi(x, \lambda)$ is non-decreasing in $\lambda$ provided that

(3.5)  $f_\lambda(x) \, \psi_\lambda(x, \lambda) - \psi_x(x, \lambda) \, \dfrac{\partial}{\partial \lambda} F_\lambda(x) \ge 0$ for all x,

where $f_\lambda(x)$ is the density corresponding to $F_\lambda(x)$. Further, $E_\lambda \psi(x, \lambda)$ is
strictly increasing in $\lambda$ if (3.5) holds with strict inequality on a set of
positive Lebesgue measure.

The above theorem is a generalization of a result of Lehmann (1959, p. 112)
which states essentially that, if $\{F_\lambda\}$ is an SI family and $\psi(x)$ is an
increasing function of x, then $E_\lambda \psi(x)$ is non-decreasing in $\lambda$. As we can
see, this comes out as a special case of Theorem 3.1, by letting $\psi(x, \lambda) = \psi(x)$
for all $\lambda$ and verifying the condition (3.5) to be true: here $\psi_\lambda = 0$, $\psi_x \ge 0$,
and $\partial F_\lambda(x)/\partial \lambda \le 0$ for an SI family.

As a consequence of Theorem 3.1, the following theorem is obtained regarding
the monotonic behavior of $\psi(\lambda; c, d, k)$.

Theorem 3.2. For the procedure $R_h$ defined by (3.1), $\psi(\lambda; c, d, k)$ is
non-decreasing in $\lambda$ provided that
Theorem 3.3. For the procedure $R_h$ defined by (3.1), the $\sup_{\Omega} E(S \mid R_h)$ is
attained when $\lambda_1 = \cdots = \lambda_k$ provided that (3.9) holds.

If the condition (3.9) holds, then (3.6) is valid and consequently
$\psi(\lambda; c, d, k)$ is non-decreasing in $\lambda$. Thus $\sup_{\Omega} E(S) = k \sup_{\lambda} \psi(\lambda; c, d, k)$ can
be evaluated. Hence, by verifying the condition (3.9) we are simultaneously
assured of the monotonicity of $\psi(\lambda; c, d, k)$, the fact which is used for the
evaluation of $\inf_{\Omega} P\{CS \mid R_h\}$ and $\sup_{\Omega} E(S \mid R_h)$. This connection between the
two has been observed by Gupta and Panchapakesan (1970).

It should be pointed out however that condition (3.6) may hold without
(3.9) being true. This is the case, for example, when we consider the selection
from Cauchy distributions in terms of the location parameter using
$h(x) = x + d$, $d > 0$. If (3.6) is satisfied, we have $\inf_{\lambda} \psi(\lambda; c, d, k) =
\psi(\lambda_0; c, d, k)$. Then we can evaluate the constants because of the conditions
imposed on h(x) provided we assume that $F_{\lambda_0}(x)$ is a distribution function in
case $\lambda_0 \notin \Lambda$.

It can be seen that the above results are readily applicable to the cases
of location and scale parameters discussed in Section 2. In the case of
location parameters the rule $R_1$ defined earlier uses $h(x) = x + d$, $d \ge 0$, and in
the scale parameter case the rule $R_2$ uses $h(x) = cx$, $c > 1$. In both the
cases it is easy to see that (3.6) is satisfied and (3.9) reduces to the
condition that the density $f_\lambda(x)$ has a monotone likelihood ratio in x.
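
The way $R_1$ and $R_2$ arise as special cases of the general rule (3.1) can be sketched directly. The code below is illustrative only; the helper names `h_location` and `h_scale` are hypothetical and the constants are arbitrary.

```python
def select_rh(xs, h):
    """General rule R_h of (3.1): keep population i iff h(x_i) >= max(xs)."""
    x_max = max(xs)
    return [i for i, x in enumerate(xs) if h(x) >= x_max]

def h_location(d):
    """h(x) = x + d, d >= 0: recovers rule R_1 of (2.1)."""
    return lambda x: x + d

def h_scale(c):
    """h(x) = c * x, c >= 1: recovers rule R_2 of (2.9) for positive data."""
    return lambda x: c * x

if __name__ == "__main__":
    print(select_rh([1.0, 2.5, 3.0], h_location(1.0)))
    print(select_rh([1.0, 2.0, 4.0], h_scale(2.0)))
```

Since property (i) requires $h(x) \ge x$, the population yielding the largest observation always passes the test, so the selected subset is non-empty for any admissible h.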

Another case of importance is that of convex mixtures of distributions.
Here the density $f_\lambda(x)$ is of the form $f_\lambda(x) = \sum_{j=0}^{\infty} w(\lambda, j) \, g_j(x)$, where
$g_j(x)$, $j = 0, 1, \ldots$, is a sequence of density functions and $w(\lambda, j)$ are
non-negative weights such that $\sum_{j=0}^{\infty} w(\lambda, j) = 1$. We assume that the weights are
given by

(3.10)  $w(\lambda, j) = \dfrac{a_j \lambda^{j}}{A(\lambda) \, j!}$,  $A(\lambda) \ge 0$, $\lambda > 0$

and

(3.11)  $a_{j+1} = (m + \ell j) \, a_j$,  $j = 0, 1, \ldots$;  $\ell, m \ge 0$.

It is easy to see that $A(\lambda) = a_0 (1 - \ell\lambda)^{-m/\ell}$, provided that $\lambda < 1/\ell$. It
has been shown by Gupta and Panchapakesan (1970) that the condition (3.9) is
satisfied if, for $\alpha = 0, 1, \ldots, [i/2]$ ($[s]$ denotes the largest integer $\le s$)
and $b \ge 1$,

(3.12)  $b^{i-\alpha}(m + \ell\alpha) \left[ g_{i-\alpha}(x) \, \Delta G_\alpha(h(x)) - h'(x) \, g_{i-\alpha}(h(x)) \, \Delta G_\alpha(x) \right]$
        $+ \; b^{\alpha}(m + \ell(i-\alpha)) \left[ g_\alpha(x) \, \Delta G_{i-\alpha}(h(x)) - h'(x) \, g_\alpha(h(x)) \, \Delta G_{i-\alpha}(x) \right] \ge 0$,

where $\Delta G_\alpha(x) = G_{\alpha+1}(x) - G_\alpha(x)$.

This special case is of interest. If we set $m = 1$, $\ell = 0$, and $a_0 = 1$,
we get Poisson weights $w(\lambda, j) = e^{-\lambda} \lambda^{j} / j!$. Selection problems involving
non-central chi-square and non-central F distributions in terms of the non-centrality
parameter fall under this special case and have been considered earlier by
Gupta (1966b), Gupta and Studden (1970), and Gupta and Panchapakesan (1969a).
These specific procedures are discussed in Section 5. Again, if we set $\ell = 1$
and $a_0 = 1$, we get mixtures of the densities $g_j(x)$ with negative binomial weights. The
distribution of $R^2$, where R is the multiple correlation coefficient, in the
so-called unconditional case is an example of this special case of weights.
Selection procedures involving this have been discussed by Gupta and Panchapakesan
(1969a) and are described in Section 5. The condition (3.12) with $b = 1$ gives
the sufficient condition for the monotonicity of $\psi(\lambda; c, d, k)$ obtained by Gupta
and Studden (1970) and Gupta and Panchapakesan (1969a) for proper choices of
weight functions.
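
The two special cases can be checked numerically from the recurrence (3.11). The sketch below is an illustration with hypothetical names; for $\ell = 0$ it uses the limiting form $A(\lambda) = a_0 e^{m\lambda}$ of the normalizing constant, which follows from $a_j = a_0 m^j$ and is stated here as an added remark rather than taken from the report.

```python
import math

def mixture_weights(lam, m, ell, a0=1.0, terms=60):
    """Weights w(lam, j), j = 0..terms-1, from (3.10) and (3.11)."""
    if lam <= 0 or (ell > 0 and lam >= 1.0 / ell):
        raise ValueError("need lam > 0 and, when ell > 0, lam < 1/ell")
    if ell == 0:
        norm = a0 * math.exp(m * lam)                   # limit of A(lam) as ell -> 0
    else:
        norm = a0 * (1.0 - ell * lam) ** (-m / ell)     # A(lam) from the text
    weights = []
    term = a0                                           # a_0 * lam**0 / 0!
    for j in range(terms):
        weights.append(term / norm)
        term *= (m + ell * j) * lam / (j + 1)           # a_{j+1} = (m + ell*j) a_j
    return weights

if __name__ == "__main__":
    poisson = mixture_weights(2.0, m=1, ell=0)
    negbin = mixture_weights(0.5, m=2, ell=1)
    print(sum(poisson), poisson[:3])
    print(sum(negbin), negbin[:3])
```

With m = 1 and $\ell = 0$ the output matches the Poisson probabilities $e^{-\lambda}\lambda^j/j!$, and with $\ell = 1$ the weights are negative binomial probabilities with index m.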

Let S' be the number of non-best populations included in the selected
subset. Then, for the procedure $R_h$ defined by (3.1), $E(S') = E(S' \mid R_h)$ is given
by $E(S') = p_1 + \cdots + p_{k-1}$. Panchapakesan (1969) has shown that $\sup_{\Omega} E(S')$ is
attained when the distributions are identical provided that (3.9) holds.
It has also been shown that the procedure $R_h$ is monotone, i.e., if
$\lambda_i \le \lambda_j$ then the probability of $\pi_j$ being selected is at least as great as
the probability of $\pi_i$ being selected.

In the case of absolutely continuous distributions $F_\lambda$, where $\lambda$ belongs
to a discrete set of real numbers, Panchapakesan (1970) has obtained the
following theorem corresponding to Theorem 3.1 and has applied it to the case of
gamma distributions with integer-valued shape parameters and common scale parameter.

Theorem 3.4. Let $\{F_\lambda\}$ be a family of absolutely continuous distributions where
$\lambda \in \Lambda = \{\lambda_1 < \lambda_2 < \cdots\}$ and $\psi(x, \lambda)$ be a real valued function possessing a
continuous partial derivative $\psi_x$ w.r.t. x. Then, for any positive integer t,
$E_\lambda \psi(x, \lambda)$ is non-decreasing in $\lambda$ provided that, for $i = 1, 2, \ldots$,

(3.13)  $\Delta\psi(x, \lambda_i) \, f_{\lambda_j}(x) - \Delta F_{\lambda_i}(x) \, \psi_x(x, \lambda_j) \ge 0$,  $j = i, i+1$,

where

$\Delta\psi(x, \lambda_i) = \psi(x, \lambda_{i+1}) - \psi(x, \lambda_i)$,  $\Delta F_{\lambda_i}(x) = F_{\lambda_{i+1}}(x) - F_{\lambda_i}(x)$.
Now we present a decision theoretic formulation of the subset selection
problem. We are given k populations $\pi_1, \ldots, \pi_k$ where $\pi_i$ is described by
the probability space $(\mathcal{X}, \mathcal{B}, P_i)$, where $P_i$ belongs to some family $\mathcal{P}$.
We assume that there is a partial order relation ($\succeq$) defined in $\mathcal{P}$.
$P_i \succeq P_j$ is equivalent to saying that $P_i$ is better than or equal to $P_j$; or,
in other words, $P_i$ is preferred over $P_j$. For example, if $\mathcal{P}$ is a
one-parameter family, $P_i(x) = P(\theta_i, x)$, we may define: $P_i \succeq P_j$ iff $\theta_i \ge \theta_j$.
In many problems $\succeq$ denotes stochastic ordering. Other partial orderings that
have been considered are: star-shaped ordering, convex ordering, tail ordering.

In the above set-up, we assume that there exists a population $\pi_j$ such
that $\pi_j \succeq \pi_i$ for all i. This population will be referred to as the
'best' population. In case of more than one population satisfying the condition
we will consider one of them to be tagged as the best.

From each population $\pi_i$ we observe a random element $X_i$. The space of
observations is: $\mathcal{X}^k = \{x = (x_1, x_2, \ldots, x_k),\ x_i \in \mathcal{X},\ i = 1, 2, \ldots, k\}$. In most
applications $\mathcal{X}$ will be a real vector space.

The decision space $\mathcal{D}$ consists of the $2^k$ subsets d of the set
$\{1, 2, \ldots, k\}$: to put it formally,

(3.14)  $\mathcal{D} = \{d \mid d \subseteq \{1, 2, \ldots, k\}\}$.

In other words, a decision d corresponds to the selection of a subset of k
populations.

A decision $d \in \mathcal{D}$ is called a correct selection (CS) if $j \in d$, which
means that the best population is included in the selected subset d. It
should be pointed out that in many subset selection procedures investigated earlier,
the null set $\emptyset$ is excluded from $\mathcal{D}$ to guarantee the selection of a non-empty
subset.
A measurable function $\delta$ defined on $\mathcal{X}^k \times \mathcal{D}$ is called a selection
procedure provided that for each $x \in \mathcal{X}^k$ we have

(3.15)  $\delta(x, d) \ge 0$  and  $\sum_{d \in \mathcal{D}} \delta(x, d) = 1$,

where $\delta(x, d)$ denotes the probability that the subset d is selected when
x is observed. The individual selection probability $p_i(x)$ for the
population $\pi_i$ is then given by

(3.16)  $p_i(x) = \sum_{d \ni i} \delta(x, d)$,

where the summation is over all d containing i. If the selection
probabilities $p_1(x), p_2(x), \ldots, p_k(x)$ take on only the values 0 and 1, then the
selection procedure $\delta(x, d)$ is completely specified.
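
The passage from a selection procedure $\delta(x, \cdot)$ to the individual selection probabilities in (3.16) is a simple marginalization, sketched below for a fixed observation x. The representation of $\delta$ as a dictionary keyed by subsets is an assumption of this example, as is the function name.

```python
def individual_probabilities(delta, k):
    """p_i(x) of (3.16) for a fixed x.

    delta : dict mapping each subset d (a frozenset of labels 1..k)
            to the probability delta(x, d) that d is selected
    """
    if any(p < 0 for p in delta.values()):
        raise ValueError("delta(x, d) must be non-negative, see (3.15)")
    if abs(sum(delta.values()) - 1.0) > 1e-12:
        raise ValueError("delta(x, d) must sum to 1 over d, see (3.15)")
    return [sum(p for d, p in delta.items() if i in d) for i in range(1, k + 1)]

if __name__ == "__main__":
    delta = {
        frozenset({3}): 0.5,
        frozenset({2, 3}): 0.3,
        frozenset({1, 2, 3}): 0.2,
    }
    print(individual_probabilities(delta, 3))
```

In the example population 3 is selected with probability one, while populations 1 and 2 are selected with probabilities 0.2 and 0.5 (up to floating-point rounding); the sum of the three values is the expected subset size for this x.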

In general, we can assume that the selection of a subset d results
in a loss. Let us consider the situation where $P_i(x) = P(\theta_i, x)$ and assume
the loss $L(\theta, d) = L((\theta_1, \theta_2, \ldots, \theta_k), d) = \sum_{i \in d} L_i(\theta)$, where $L_i(\theta)$ is the loss
if the ith population is selected. We may assume an additional loss L if
a correct selection is not made. The overall risk for the nonrandomized rule
$\delta$ is:

(3.17)  $R(\theta, \delta) = \sum_{i=1}^{k} L_i(\theta) \, E_\theta \, p_i(x) + L \, [1 - P_\theta\{CS \mid \delta\}]$.

In many problems it has been assumed that $L_i(\theta) = 1$ and $L = 0$, in
which case $R(\theta, \delta)$ gives the expected size of the selected subset. In
general, our aim is to minimize the risk $R(\theta, \delta)$, which will be done under
the usual symmetry condition.
Our goal is to obtain selection rules $\delta$ selecting a non-empty subset and
satisfying the P*-condition. In general, we wish rules with large probability
of a correct selection and a small value of the expected size. The ratio
$k \, P_\omega\{CS \mid \delta\} / E_\omega\{S \mid \delta\}$ can, among others, be considered as a measure of
the efficiency of the procedure $\delta$ at $\omega = (P_1, \ldots, P_k)$, $P_i \in \mathcal{P}$. Both
$P_\omega\{CS \mid \delta\}$ and $E_\omega\{S \mid \delta\}$ depend on $\delta$ only through the individual selection
probabilities and hence if we restrict our attention to these quantities, we
can define two rules $\delta$ and $\delta'$ as equivalent if they have the same individual
selection probabilities p(x) and p'(x) for all x. Hence, we can use the
following simplified definition, replacing $\delta$ by R.

A subset selection rule R is a measurable mapping from $\mathcal{X}^k$ into $E^k$ (k
dimensional Euclidean space), namely,

R: $x \to (p_1(x), p_2(x), \ldots, p_k(x))$,  $0 \le p_i(x) \le 1$,  $i = 1, 2, \ldots, k$.

If the $p_i$'s are 0 or 1, the rule is nonrandomized; in this case, R can also
be defined by the sets $A_i = \{x \in \mathcal{X}^k \mid p_i(x) = 1\}$, $i = 1, 2, \ldots, k$. $A_i$ is the set
of observations for which $\pi_i$ is selected. R is said to be unbiased iff

$\pi_j \succeq \pi_i$, $i = 1, 2, \ldots, k$  $\Rightarrow$  $P_{\omega,j} \ge P_{\omega,i}$ for all $\omega \in \Omega$,

where $P_{\omega,i} = E_\omega \, p_i(x)$ = probability that $\pi_i$ is selected, and is said to be
monotone iff

$\pi_j \succeq \pi_i$  $\Rightarrow$  $P_{\omega,j} \ge P_{\omega,i}$ for all i, j and all $\omega \in \Omega$.

We shall restrict ourselves to selection rules R which are invariant under
permutation (or symmetric), i.e., rules R for which

$(p_1(gx), \ldots, p_k(gx)) = g(p_1(x), \ldots, p_k(x))$ for all $x \in \mathcal{X}^k$, $g \in G$,

where G denotes the group of permutations g of the integers $1, 2, \ldots, k$.

Studden (1967) has discussed the problem of obtaining optimal procedures.
He has obtained a necessary and sufficient condition that a rule $\delta$ be best
invariant, that is, $\delta$ is an invariant rule for which $R(\theta, \delta)$ is minimum.
Assume $\Omega_0$ to be those permutations of $(\theta_1, \ldots, \theta_k)$ such that the largest
parameter value is in the last component and let
$\phi_i(x; \theta) = \dfrac{1}{(k-1)!} \sum_{g \in G_i} f(x, g\theta)$, $i = 1, \ldots, k$, where $f(x, \theta)$ is the joint
density of x (w.r.t. some measure $\mu$) and $G_i = \{g \mid g^{-1}k = i\}$. The following
theorem has been proved by Studden.

Theorem 3.5. A selection rule $\delta$ is best invariant iff

(3.18)  $p_k(x) = 1$ if $L \, \phi_k(x; \theta) > \sum_{i=1}^{k} L_i(\theta) \, \phi_i(x; \theta)$
               $= 0$ if $L \, \phi_k(x; \theta) < \sum_{i=1}^{k} L_i(\theta) \, \phi_i(x; \theta)$

for $\theta \in \Omega_0$ almost everywhere $\mu$. The functions $p_i(x)$, $i < k$, are defined
by the invariance conditions on $p(x) = (p_1(x), \ldots, p_k(x))$. As a corollary, we
obtain the result: An invariant selection procedure minimizes $\sum_{i=1}^{k} L_i(\theta) \, E_\theta \, p_i(x)$
subject to the condition

(3.19)  $P_\theta\{CS \mid \delta\} \ge \gamma(L)$ for all $\theta \in \Omega$

iff the individual selection probabilities are determined by (3.18).

The expression given in (3.18) defining the selection probabilities which
minimize $R(\theta, \delta)$ is rather complicated when written down in terms of the
original densities. However, for the slippage situation when the underlying
densities are from an exponential family and $L_i(\theta) = 1$, the expressions
simplify considerably and in this case the following theorem has been obtained
by Studden.

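Theorem 3.6.  Let $f_{\boldsymbol\theta}(\mathbf{x}) = \prod_{i=1}^{k} f_{\theta_i}(x_i)$ where $f_\theta(x) = C(\theta)\, e^{\theta x}$ and
$\theta_{[1]} = \cdots = \theta_{[k-1]} = \theta_{[k]} - \Delta$ $(\Delta > 0)$. An invariant rule $\delta$ minimizes
$E_\theta\{S \mid \delta\}$ subject to the condition that $P_\theta\{CS \mid \delta\} \ge \gamma$ iff for almost
all x

(3.20)  $p_k(x) = 1$  if  $\sum_{i=1}^{k-1} e^{\Delta x_i} < C e^{\Delta x_k}$

        $p_k(x) = 0$  if  $\sum_{i=1}^{k-1} e^{\Delta x_i} > C e^{\Delta x_k}$.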


Studden also considered a simple situation concerning normal populations
where the parameters are permitted to vary. It is assumed that
$f(x;\theta) = \prod_{i=1}^{k} f(x_i - \theta_i)$ where f(x) is the standard normal density. For fixed $\Delta$ let
$p(x;\Delta)$ denote the selection probabilities defined by (3.20) where C is
chosen so that $P_\theta\{CS \mid p(x;\Delta)\} = \gamma$ for all $\theta = (\theta, \ldots, \theta, \theta+\Delta)$. Let $\Phi(\Delta)$ denote
the class of invariant procedures satisfying

(3.21)  $P_\theta\{CS \mid \delta\} \ge \gamma$  for all $\theta \in \Omega(\Delta)$

where $\Omega(\Delta) = \{\theta \mid \theta_{[1]} \le \theta_{[2]} \le \cdots \le \theta_{[k-1]} \le \theta_{[k]} - \Delta\}$.

Theorem 3.7.  For any $\theta$ with $\theta_{[1]} = \theta_{[2]} = \cdots = \theta_{[k-1]} = \theta_{[k]} - \Delta$ the
minimum value of $E_\theta\{S \mid \delta\}$ over the class $\Phi(\Delta)$ is attained by $p(x;\Delta)$, i.e.,

(3.22)  $\min_{\delta \in \Phi(\Delta)} E_\theta\{S \mid \delta\} = E_\theta\{S \mid p(x;\Delta)\}$.

Now, consider the sequence of selection probabilities defined for
$\Delta \in (0, \infty)$ by

(3.23)  $p_k(x;\Delta) = 1$  if  $\sum_{i=1}^{k-1} e^{\Delta x_i} < C(\Delta)\, e^{\Delta x_k}$

        $p_k(x;\Delta) = 0$  if  $\sum_{i=1}^{k-1} e^{\Delta x_i} > C(\Delta)\, e^{\Delta x_k}$.

For $\Delta = 0$ we let

(3.24)  $p_k(x;0) = 1$  if  $\sum_{j=1}^{k-1} x_j/(k-1) < x_k + C(0)$

        $p_k(x;0) = 0$  if  $\sum_{j=1}^{k-1} x_j/(k-1) > x_k + C(0)$,

while for $\Delta = \infty$ we define

(3.25)  $p_k(x;\infty) = 1$  if  $\max_{1 \le j \le k-1} x_j < x_k + C(\infty)$

        $p_k(x;\infty) = 0$  if  $\max_{1 \le j \le k-1} x_j > x_k + C(\infty)$.

The values $C(\Delta)$, $\Delta \in [0, \infty]$, are all chosen so that for a fixed set of values
$\theta_{[1]} \le \cdots \le \theta_{[k]}$, the probability of a correct selection is equal to a given
value $\gamma$. The rules defined in (3.24) and (3.25) have been considered by
several authors. It has been observed by Studden that $p_k(x;\Delta)$ has limits
$p_k(x;0)$ and $p_k(x;\infty)$ almost everywhere $\mu$ as $\Delta$ approaches zero and infinity,
respectively.
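
A heuristic sketch of why limits of this kind are plausible (this is not Studden's argument, and it assumes that the rescaled constant on the right-hand side converges): taking logarithms in the inequality defining the rule and dividing by $\Delta$ turns the sum of exponentials into a log-mean-exponential, whose limits at 0 and $\infty$ are the arithmetic mean and the maximum.

```latex
\sum_{i=1}^{k-1} e^{\Delta x_i} < C(\Delta)\, e^{\Delta x_k}
\iff
\frac{1}{\Delta}\log\Bigl(\frac{1}{k-1}\sum_{i=1}^{k-1} e^{\Delta x_i}\Bigr)
  < x_k + \frac{1}{\Delta}\log\frac{C(\Delta)}{k-1},
\\[4pt]
\lim_{\Delta \to 0}\ \frac{1}{\Delta}\log\Bigl(\frac{1}{k-1}\sum_{i=1}^{k-1} e^{\Delta x_i}\Bigr)
  = \frac{1}{k-1}\sum_{i=1}^{k-1} x_i,
\qquad
\lim_{\Delta \to \infty}\ \frac{1}{\Delta}\log\Bigl(\frac{1}{k-1}\sum_{i=1}^{k-1} e^{\Delta x_i}\Bigr)
  = \max_{1 \le i \le k-1} x_i .
```

The left-hand sides therefore match the statistics compared with $x_k$ in the average-type and maximum-type rules; the behaviour of the constant term is the part that needs the calibration to a fixed probability of correct selection.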

In addition to several desirable properties and criteria for selection rules
discussed above, another concept was investigated by Nagel (1970). This is
concerned with what are called "just" selection rules.

We assume that a partial order relation $\succeq$ is defined on $\mathcal{X}$ [$y \succeq x$ or,
equivalently, $x \preceq y$ means that y is better than x]. A selection rule R
defined by its individual selection probabilities $p_i(x)$, $i = 1, \ldots, k$, is said
to be just iff

(3.26)  $x_i \preceq y_i,\ x_j \succeq y_j\ \ \forall j \ne i \;\Rightarrow\; p_i(x) \le p_i(y)$.


For nonrandomized rules determined by acceptance regions $A_1, \ldots, A_k$ we
can define a just rule equivalently in terms of increasing sets. A subset
$A \subset \mathcal{X}$ is said to be increasing iff $x \in A$ and $y \succeq x$ $\Rightarrow$ $y \in A$. We say
that P is stochastically better than Q ($P \succeq Q$) iff $P(A) \ge Q(A)$ for all
increasing sets $A \in \mathcal{B}$. We note that if $\mathcal{X}$ is the real line and $\succeq$ stands
for $>$ (or $\ge$) then the increasing sets are the intervals $[a, \infty)$ and $(a, \infty)$
which induce the usual stochastic ordering on the distribution functions. A
rule R is said to be just iff

$x \in A_i,\ x_i \preceq y_i,\ x_j \succeq y_j\ \ \forall j \ne i$  implies  $y \in A_i$.


As mentioned earlier, frequently we require a selection rule to satisfy
the basic probability requirement. Hence, a central problem in the subset
selection theory is to determine $\inf_{\omega \in \Omega} P_\omega\{CS \mid R\}$. For many rules investigated
in the literature, this infimum is attained in $\Omega_0$, where $\Omega_0 \subset \Omega$ is the set
of $\omega$ where the $\theta_i$ are identical. This could reasonably be expected of a
good rule, because in $\Omega_0$ no statistical information can be employed to find
the arbitrarily tagged population. It has been proved by Nagel (1970) that this
property holds for a just selection rule, i.e.,

(3.27)  $\inf_{\omega \in \Omega} P_\omega\{CS \mid R\} = \inf_{\omega \in \Omega_0} P_\omega\{CS \mid R\}$,  if R is just.

It is also a reasonable requirement that $P_\omega\{CS \mid R\}$ be constant over $\Omega_0$,
because in stating the P*-condition, we express that we are content if
$P_\omega\{CS \mid R\}$ is at least P* and we are not interested in exceeding P*, at
least not in $\Omega_0$ where it can be achieved only by increasing the expected
number of populations in the selected subset.

The following lemma can be applied to construct just subset selection rules
with constant probability of a correct selection in $\Omega_0$.

Lemma 3.1.  Let $X_1, X_2, \ldots, X_k$ be independent and identically distributed
random variables with joint distribution $P_\theta$. Let $T(X_1, X_2, \ldots, X_k)$ be a
sufficient statistic for $\theta$.

(i)  If $E(\delta(X_1, \ldots, X_k) \mid T) = P^*$ for all T then $E_\theta \delta = P^*$ for all $\theta$.

(ii)  If T is complete w.r.t. $\{P_\theta(x)\}$, then $E(\delta(X_1, \ldots, X_k) \mid T) = P^*$
is also necessary for $E_\theta \delta = P^*$ for all $\theta$.
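
The reasoning behind the lemma is short enough to write out. The following is a sketch in the notation of the lemma; the "almost surely" qualifier in the second line is an assumption about how part (ii) is meant to be read.

```latex
E_\theta\,\delta = E_\theta\bigl[\,E(\delta \mid T)\,\bigr] = E_\theta[P^*] = P^*
  \quad \text{for all } \theta,
\\[4pt]
E_\theta\bigl[\,E(\delta \mid T) - P^*\,\bigr] = 0 \ \text{ for all } \theta
  \ \Longrightarrow\ E(\delta \mid T) = P^* \ \text{ a.s., when } T \text{ is complete.}
```

Sufficiency of T is what makes $E(\delta \mid T)$ free of $\theta$, so a rule can be calibrated conditionally on T without knowing $\theta$.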

Gupta and Nagel (1971) have investigated the problem of constructing just
rules in the cases of some discrete distributions such as binomial, Poisson and
negative binomial distributions, which are discussed in the next section. They
have also discussed the problem of deriving rules with constant P(CS|R) in
$\Omega_0$ using the likelihood ratio criterion. They consider densities
$f(x_i, \theta_i)$, i = 1,...,k, where $f(x, \theta)$ is given by

(3.28)  $f(x, \theta) = c(\theta)\, e^{\theta T(x)}\, h(x)$.

Under the slippage configuration, they derive the rule

R:  Select $\pi_i$ iff $T_i \ge T_{[k]} - c$

where $c = c(k, P^*, \theta, \delta)$ is determined from the P*-condition. This rule is
just and the constant c is given by

(3.29)  $\int_{-\infty}^{\infty} G_\theta^{k-1}(t + c)\, dG_\theta(t) = P^*$

where $G_\theta$ is the cdf of T. For the normal distributions with $\theta$ as the location
parameter, c is independent of $\theta$. In general, c depends on $\theta$ and, if $\theta$ is
not known, an estimator of $\theta$ may be used. Since $\sum T_i$ is a sufficient statistic
for $\theta$, this yields a selection rule of the form

(3.30)  Select $\pi_i$ iff $T_i \ge T_{[k]} - c(\sum T_j, P^*)$.

By Lemma 3.1, this rule has constant probability of a correct selection in
$\Omega_0$ if $c(\sum T_j, P^*)$ is determined to satisfy

(3.31)  $P\{T_i \ge T_{[k]} - c(\sum T_j, P^*) \mid \sum T_j\} = P^*$

for all $\sum T_j$, $\omega \in \Omega_0$. However, it is not known whether (3.30) is a just rule.
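
As an illustration of how an equation of the form (3.29) pins down the constant c, here is a minimal numerical sketch for the case where T is standard normal, so that c does not depend on the location parameter. The unit variance, the integration range, the bisection bracket and the function names `pcs_equal_means` and `solve_c` are all assumptions made for this example; they are not taken from Gupta and Nagel.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def pcs_equal_means(c: float, k: int, lo: float = -8.0, hi: float = 8.0,
                    steps: int = 2000) -> float:
    """Integral of G^(k-1)(t + c) dG(t) for G standard normal (Simpson's rule).

    `steps` must be even; the range [lo, hi] truncates the integral.
    """
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps + 1):
        t = lo + j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * norm_cdf(t + c) ** (k - 1) * norm_pdf(t)
    return total * h / 3.0


def solve_c(k: int, p_star: float, tol: float = 1e-8) -> float:
    """Bisection for c >= 0 with pcs_equal_means(c, k) = p_star.

    Assumes 1/k <= p_star < 1, since the integral equals 1/k at c = 0.
    """
    lo, hi = 0.0, 20.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pcs_equal_means(mid, k) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For k = 2 the integral reduces to $\Phi(c/\sqrt{2})$, which gives a convenient check on the numerical solution.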
4.  Selection from Discrete Populations

In  this  section  we  discuss  the  results  of  investigations  of  procedures 
for  selection  from  k  independent  discrete  populations.  Though  selection  of 
the  multinomial  cell  with  the  largest  (smallest)  probability  where  the  obser¬ 
vations  are  on  integer  valued  random  variables  falls  under  this  category,  we 
discuss  it  in  the  next  section  along  with  problems  concerning  multivariate 
normal  populations.  The  case  where  only  the  ranks  of  the  observations  are 
considered  is  discussed  in  the  section  on  distribution-free  procedures.  Our 
present  discussion  will  be  mainly  concerned  with  selection  from  binomial, 

Poisson  and  negative  binomial  populations. 

Binomial  Case: 

We have k independent binomial populations $\pi_i$ (i = 1,...,k) with unknown
probabilities of success on a single trial $\theta_1, \ldots, \theta_k$ respectively, where
$0 \le \theta_i \le 1$, i = 1,...,k. The following procedure R based on samples of size n
from each population has been proposed by Gupta and Sobel (1960).

R:  Select the population $\pi_i$ iff

(4.1)  $x_i \ge \max(x_1, \ldots, x_k) - d$

where $x_i$ is the observed number of successes in n observations from $\pi_i$ and
d = d(n,k,P*) is the smallest non-negative integer that will satisfy the P*-condition.

It is known that P{CS|R} is minimized when $\theta_1 = \cdots = \theta_k$. Thus, the integer d
is the smallest non-negative integer for which

(4.2)  $\inf_{0 \le \theta \le 1} \sum_{\alpha=0}^{n} \binom{n}{\alpha} \theta^{\alpha} (1-\theta)^{n-\alpha} \Bigl[ \sum_{j=0}^{\alpha+d} \binom{n}{j} \theta^{j} (1-\theta)^{n-j} \Bigr]^{k-1} \ge P^*$.

The above procedure and another procedure for the case of samples of
unequal sizes along with the normal approximations for both these cases
have been discussed earlier in the literature and have been briefly
summarized by Gupta (1966a). It has been shown by Gupta and Sobel that
for k=2, the infimum in (4.2) is attained for $\theta$ = 1/2, and that, for a fixed
k, the value $\theta_0$ at which the infimum takes place tends to 1/2 as $n \to \infty$.
However, in general, the value of $\theta$ for which the infimum takes place is
not known. When $\theta_1 = \cdots = \theta_k = \theta$, P(CS|R) can be written as a polynomial of
degree nk in $\theta$. Let

(4.3)  $P(CS \mid R) = Q_{k,n,d}(\theta) = \sum_{i=0}^{nk} c_i(k,n,d)\, \theta^{i}$.

The minimum of $Q_{k,n,d}(\theta)$ is attained for some $\theta_0$, $0 < \theta_0 < 1$, for which
$\frac{dQ}{d\theta}\big|_{\theta=\theta_0} = 0$. Nagel (1966) has evaluated the coefficients $c_i(k,n,d)$ numerically
for k=2(1)7, n=2(1)7 and d=0(1)n-1. It is found that the first derivative is
of the form

(4.4)  $\frac{dQ}{d\theta} = [\theta(1-\theta)]^{d-1}\, T(\theta)$

where $T(\theta)$ is a polynomial in $\theta$. The computations showed that $Q(\theta)$ may have
several minima in (0,1). A table of Q values is given for a few selected
values of k and n.
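
The search for the constant d in the Gupta-Sobel binomial procedure can be sketched numerically. The code below is a minimal illustration, not the authors' computation: it evaluates the probability of a correct selection under equal success probabilities and approximates the infimum over $\theta$ by a finite grid, which is an assumption (the true minimizing $\theta$ is not known in general). All function names are hypothetical.

```python
from math import comb


def binom_pmf(n, theta):
    """Binomial(n, theta) probabilities for 0, 1, ..., n successes."""
    return [comb(n, a) * theta**a * (1.0 - theta) ** (n - a) for a in range(n + 1)]


def pcs_equal(theta, n, k, d):
    """P{CS|R} when all k populations share the success probability theta."""
    pmf = binom_pmf(n, theta)
    cdf, running = [], 0.0
    for p in pmf:
        running += p
        cdf.append(running)
    # The tagged population is selected when every other count is <= its count + d.
    return sum(pmf[a] * cdf[min(a + d, n)] ** (k - 1) for a in range(n + 1))


def inf_pcs(n, k, d, grid=200):
    """Grid approximation to the infimum over theta in [0, 1]."""
    return min(pcs_equal(i / grid, n, k, d) for i in range(grid + 1))


def smallest_d(n, k, p_star, grid=200):
    """Smallest non-negative integer d whose (grid) infimum reaches p_star."""
    for d in range(n + 1):
        if inf_pcs(n, k, d, grid) >= p_star - 1e-12:
            return d
    return n
```

With n = 1 and k = 2 the equal-parameter probability is $1 - \theta + \theta^2$, whose minimum 3/4 at $\theta = 1/2$ agrees with the statement that the infimum is attained at 1/2 when k = 2.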

Gupta and Nagel (1971) have constructed a rule $R_0$ for the above binomial
problem which overcomes the difficulty of finding the infimum of the probability
of a correct selection. Their goal is to construct a just rule such that
$P_\omega\{CS \mid R\} = P^*$ for all $\omega \in \Omega_0$, where $\Omega_0 = \{\omega : \omega = (\theta, \ldots, \theta)\}$. It is clear that
this goal cannot be achieved with a nonrandomized rule, because when
$\omega = (0, \ldots, 0)$ or $(1, \ldots, 1)$ the observations will be x = (0,...,0) or x = (n,...,n)
with probability 1, requiring the use of individual selection probabilities
$p_i(x) = P^*$.

The joint density for $\omega \in \Omega_0$ is

(4.5)  $f_\omega(x_1, x_2, \ldots, x_k) = (1-\theta)^{nk} \exp\Bigl[ \Bigl( \sum_{i=1}^{k} x_i \Bigr) \log \frac{\theta}{1-\theta} \Bigr] \prod_{i=1}^{k} \binom{n}{x_i}$.

We see that $T = \sum_{i=1}^{k} X_i$ is a sufficient statistic for $\theta$. Since we are
interested in symmetric rules R it is sufficient to know one of the individual
selection probabilities, say, $p_k$. From Lemma 3.1 it follows that

(4.6)  $E(p_k(X) \mid T) = P^*$  for T = 0, 1, ..., kn.


The  requirement  that  R  be  just  leads  to 


Figure 1 shows the partial ordering induced by (4.7) among the observation
vectors for the case k=3, n=2. The individual selection probability
$p_k(x_1, x_2, x_3)$ defines a just rule if its values are nondecreasing in the
direction of the arrows. Because of symmetry only one of the two permutations
$(x_1, x_2, x_3)$ and $(x_2, x_1, x_3)$ is plotted. The numbers underneath the
observation vectors denote the corresponding T values.


Figure 1.  Partial Ordering for Binomial Observations, k=3, n=2.


The conditions (4.6) and (4.7) do not determine a rule uniquely.
Gupta and Nagel have proposed the following rule $R_0$:

(4.8)  $p_k(x) = 1$  if  $x_k > c_T$
       $p_k(x) = \rho$  if  $x_k = c_T$
       $p_k(x) = 0$  if  $x_k < c_T$

where $\rho = \rho(T, P^*, k)$ and $c_T = c_T(P^*, k)$ are determined to satisfy

(4.9)  $E(p_k(X) \mid T) = P\{X_k > c_T \mid T\} + \rho\, P\{X_k = c_T \mid T\} = P^*$.

The conditional distribution of $X_k$ given T is hypergeometric:

(4.10)  $P\{X_k = i \mid T\} = \binom{n}{i} \binom{(k-1)n}{T-i} \Big/ \binom{kn}{T}$.

Let $Z_T$ have the same distribution as $X_k$ given T. Then (4.9) becomes

(4.11)  $P\{Z_T > c_T\} + \rho\, P\{Z_T = c_T\} = P^*$

and the constant $c_T$ is the smallest integer determined from the inequalities

(4.12)  $P\{Z_T > c_T\} \le P^*$

and

(4.13)  $P\{Z_T \ge c_T\} > P^*$.

From (4.11), we have

(4.14)  $\rho = \frac{P^* - P\{Z_T > c_T\}}{P\{Z_T = c_T\}}$.
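
The determination of the cutoff and the randomization probability from the hypergeometric conditional distribution can be sketched as follows. This is a minimal illustration under the assumption 0 < P* < 1; the function names are hypothetical and floating-point ties at the boundary are not treated specially.

```python
from math import comb


def hypergeom_pmf(i, n, k, T):
    """P{X_k = i | T}: i successes from one population of size n, T in total."""
    if i < 0 or i > n or T - i < 0 or T - i > (k - 1) * n:
        return 0.0
    return comb(n, i) * comb((k - 1) * n, T - i) / comb(k * n, T)


def rule_constants(n, k, T, p_star):
    """Return (c_T, rho): smallest c with P{Z > c} <= P*, then rho from the
    requirement P{Z > c} + rho * P{Z = c} = P*."""
    lo = max(0, T - (k - 1) * n)   # smallest possible value of Z_T
    hi = min(n, T)                 # largest possible value of Z_T
    for c in range(lo, hi + 1):
        upper_tail = sum(hypergeom_pmf(i, n, k, T) for i in range(c + 1, hi + 1))
        if upper_tail <= p_star:
            rho = (p_star - upper_tail) / hypergeom_pmf(c, n, k, T)
            return c, rho
    raise ValueError("p_star must lie in (0, 1)")
```

Scanning c upward over the support means the first c whose upper tail is at most P* also has $P\{Z_T \ge c\} > P^*$, because the previous candidate failed the first inequality. For T = 0 the distribution is degenerate and the sketch returns $\rho$ = P*, in line with the remark that extreme observations force selection probability P*.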


It has been established by Gupta and Nagel (1971) that the above rule
$R_0$ is just. They have also tabulated the values of $c_T$ and $\rho$ for k=2,3,5;
n=5,10 and P*=.75, .90, .95, .99, in each case T going from 0 to nk.

Since T takes on the values 0,1,...,kn these tables become very
extensive for large values of k and n. Therefore it is desirable to find
approximations for $c_T$ and $\rho$. The normal approximation for the hypergeometric
distribution gives good results when n is large and T is not extreme (close
to 0 or kn). The expectation and variance of $Z_T$ are $\mu = T/k$ and
$\sigma^2 = \frac{T(kn-T)(k-1)}{(kn-1)k^2}$ respectively. Using the fact that asymptotically $Z_T$ is $N(\mu, \sigma^2)$, we obtain
the approximate value $\hat{c}_T$ given by $\hat{c}_T = [\tfrac{1}{2} + \mu - \sigma \Phi^{-1}(P^*)]$ where $\Phi^{-1}$ is the
inverse of the standard normal cdf and [x] is the integral part of x. For
$\rho$ we get the approximate value $\hat{\rho} = \hat{c}_T + 0.5 - (\mu - \sigma \Phi^{-1}(P^*))$. The exact and
approximate values of $c_T$ and $\rho$ have been compared by Gupta and Nagel for
k=2,3,5,10; n=5,10,20; and some selected values of T and P*. The results
show no change in the values of $c_T$ and $\hat{c}_T$, and only small deviations in the
values of $\rho$ and $\hat{\rho}$.
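
The normal approximation just described is easy to evaluate directly. The sketch below is an illustration only: it reads "integral part" as `math.floor` (an assumption that matters only for negative arguments), requires kn > 1, and uses a hypothetical function name.

```python
import math
from statistics import NormalDist


def approx_constants(n, k, T, p_star):
    """Normal-approximation values (c_hat, rho_hat) for the rule constants."""
    mu = T / k
    sigma = math.sqrt(T * (k * n - T) * (k - 1) / ((k * n - 1) * k * k))
    # mu - sigma * z, where z is the P* quantile of the standard normal
    cutoff = mu - sigma * NormalDist().inv_cdf(p_star)
    c_hat = math.floor(0.5 + cutoff)
    rho_hat = c_hat + 0.5 - cutoff
    return c_hat, rho_hat
```

Since `c_hat` is the floor of `cutoff + 0.5`, the value `rho_hat` always falls in the interval (0, 1], so it can be used as a randomization probability without clipping.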

The nonrandomized version $R_0'$ of $R_0$, namely, $R_0'$: Select $\pi_i$ iff $x_i \ge c_T$,
is conservative in the sense of meeting the basic probability requirement.
However, $R_0'$ may not be just and it selects large subsets if the $\theta_i$'s are
close to zero or one. A comparison of $R_0$ and R is difficult because
$\inf_\Omega P\{CS \mid R\}$ is not known in the case of R. Since it takes place near
$\theta = \tfrac{1}{2}$ the P*-value for $R_0$ has been chosen by Gupta and Nagel to satisfy
$P_\omega\{CS \mid R\} = P^*$ with $\omega = (\tfrac{1}{2}, \tfrac{1}{2}, \ldots, \tfrac{1}{2})$ which makes the comparison slightly more
favorable for R. Under slippage configuration $(\theta, \ldots, \theta, \theta+\delta)$, the numerical
computations show that $R_0$ yields better results for small values of $\delta$, while
R is better for large $\delta$. Hence $R_0$ should be applied if small differences in
the success probabilities are expected. This advantage of $R_0$ becomes more
evident in the case of equally spaced configurations, where almost surely
more than half of the populations will be retained in the selected subset
if the number of observations is increased indefinitely, whereas R will
eventually select only the best one.

Gupta and Nagel (1971) have studied rules similar to $R_0$ defined by
(4.8) for the problem of selection from Poisson and negative binomial
distributions. The case of Fisher's logarithmic distributions has been discussed
by Nagel (1970).

In connection with selection from discrete populations Nagel (1966)
considered the problem of minimizing

(4.15)  $A = \sum_{i=0}^{n} a_i \Bigl( \sum_{j=0}^{i+d} a_j \Bigr)^{k-1}$

under the condition

(4.16)  $\sum_{i=0}^{n} a_i = 1$,  $a_i \ge 0$ for i = 0,...,n.

Setting

(4.17)  $A_i = \sum_{j=0}^{i} a_j$, i = 0,...,n;  $A_i = 0$, i < 0;  $A_i = A_n$, i > n,

we have

(4.18)  $A = \sum_{i=0}^{n} (A_i - A_{i-1})\, A_{i+d}^{k-1}$.

For  d  =  0,  it  has  been  shown  that  the  minimum  of  A  is  given  by 


<419>  A»in  (k’n)  TTV 

If $b_k = (k-1)/k^{k/(k-1)}$, then

(4.20)  $A_{\min}(k, n+1) = 1 - b_k / (A_{\min}(k,n))^{1/(k-1)}$.

$A_{\min}(k,n)$ has been tabulated for k=2(1)8 and n=1(1)25. The case of d > 0
can be handled using the results for the d = 0 case.
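
The recursion for the d = 0 minimum is simple to iterate. In the sketch below the starting value $A_{\min}(k,0) = 1$ is an assumption derived from the constraint (with n = 0 the single weight must equal 1, so A = 1); it is not stated in the text, and the function name is hypothetical.

```python
def a_min(k: int, n: int) -> float:
    """Iterate A_min(k, m+1) = 1 - b_k / A_min(k, m)**(1/(k-1)) from m = 0 to n."""
    b_k = (k - 1) / k ** (k / (k - 1))
    value = 1.0  # assumed starting value A_min(k, 0) = 1
    for _ in range(n):
        value = 1.0 - b_k / value ** (1.0 / (k - 1))
    return value
```

For k = 2 this gives 3/4 and 2/3 for n = 1 and n = 2, which can be confirmed by minimizing $a_0^2 + a_1$ and $A_0^2 + (A_1 - A_0)A_1 + (1 - A_1)$ directly.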
5.  Selection Procedures for Multinomial and Multivariate Normal Distributions.
I.  Multinomial Case.

Let $p_1, p_2, \ldots, p_k$ be the unknown cell-probabilities in the multinomial
distribution with $\sum_{i=1}^{k} p_i = 1$. Let $x_1, x_2, \ldots, x_k$ be the respective
observations in the k cells of the distribution with $\sum_{i=1}^{k} x_i = N$. Let the ordered
cell-probabilities be given by $p_{[1]} \le p_{[2]} \le \cdots \le p_{[k]}$. For selecting a
subset of the cells containing the cell associated with $p_{[k]}$, Gupta and
Nagel (1967) proposed and investigated the following procedure

$R_1$:  Select the cell with observed $x_i$ iff

(5.1)  $x_i \ge \max(x_1, \ldots, x_k) - D$

where D is a given non-negative integer. Using this rule the probability of
a correct selection is given by

(5.2)  $P\{CS \mid R_1\} = F(k, N, D;\, p_{[1]}, \ldots, p_{[k]}) = \sum \frac{N!}{\nu_1! \cdots \nu_k!}\, p_{[1]}^{\nu_1} \cdots p_{[k]}^{\nu_k}$,

the summation being over all $(\nu_1, \ldots, \nu_k)$ with $\sum \nu_i = N$ and $\nu_i \le \nu_k + D$,
i = 1,2,...,k.

Then  the  following  lemma  can  be  established. 

Lemma 5.1.  (i) If the sum $p_{[i]} + p_{[j]}$, $1 \le i < j < k$, is kept constant,
$P\{CS \mid R_1\}$ decreases as we pass from the configuration $(p_{[1]}, \ldots, p_{[i]}, \ldots, p_{[j]}, \ldots, p_{[k]})$
to $(p_{[1]}, \ldots, p_{[i]} - \varepsilon, \ldots, p_{[j]} + \varepsilon, \ldots, p_{[k]})$ where
$0 < \varepsilon < p_{[i]}$.

(ii) If the sum $p_{[i]} + p_{[k]}$, $1 \le i < k$, is kept constant, $P\{CS \mid R_1\}$
decreases as we pass from the configuration $(p_{[1]}, \ldots, p_{[i]}, \ldots, p_{[k]})$ to
$(p_{[1]}, \ldots, p_{[i]} + \varepsilon, \ldots, p_{[k]} - \varepsilon)$ where $0 < \varepsilon \le p_{[k]}$.

By using this lemma, the following theorem is obtained.

Theorem 5.1.  Let $\mu$ be the smallest integer such that $p_{[\mu]} > 0$ and let $\nu$
be the largest integer such that $p_{[\nu]} < p_{[k]}$. Then, for a configuration
minimizing $P\{CS \mid R_1\}$, $\mu \ge \nu$. In particular, if $\mu = k-1$, then $\mu > \nu$.

As a consequence of the above theorem, we have

(5.3)  $\inf_\Omega P\{CS \mid R_1\} = \min_{r=2,\ldots,k} \Bigl( \min_{1/r \le p \le 1/(r-1)} F(k, N, D;\, (0, \ldots, 0, s, p, \ldots, p)) \Bigr)$

where s = 1 - (r-1)p and $\Omega$ is the space of all configurations of $p_1, \ldots, p_k$.

For the purposes of computations it is not necessary to consider the
cases where r < k, when the problem is already solved for all smaller values
of k for the same N and D. In other words, we need consider only vectors of
the type (s,p,...,p), s = 1 - (k-1)p. On the basis of numerical evaluations
of F(k,N,D; (s,p,...,p)) done for D = 0(1)4, k = 2(1)10 and N = 2(1)15, it was
found that the minimum over p took place either for p = 1/k or for p = 1/(k-1)
except in the case of k = 3, N = 6 and D = 4 for which the minimum was attained
in the interior of the interval (1/k, 1/(k-1)).
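
A brute-force evaluation of the probability of a correct selection for the fixed-sample multinomial rule, together with a scan over configurations of the type (s,p,...,p), can be sketched as below. This is a minimal illustration by direct enumeration, practical only for small k and N; the grid over p is an assumption, the function names are hypothetical, and it is not the computation used for the published tables.

```python
from math import factorial


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def multinomial_pmf(counts, probs):
    """Multinomial probability of the cell counts `counts`."""
    coeff = factorial(sum(counts))
    value = 1.0
    for v, p in zip(counts, probs):
        coeff //= factorial(v)   # each intermediate quotient is an integer
        value *= p ** v
    return coeff * value


def pcs_r1(N, D, probs):
    """P{CS}: probs sorted ascending, so the last cell is the best one.

    The best cell is selected when every count is <= its count + D.
    """
    return sum(
        multinomial_pmf(nu, probs)
        for nu in compositions(N, len(probs))
        if max(nu) <= nu[-1] + D
    )


def min_over_p(k, N, D, grid=100):
    """Scan (s, p, ..., p), s = 1 - (k-1)p, for p on a grid in [1/k, 1/(k-1)].

    Returns (smallest probability found, p at which it occurs).
    """
    best = None
    for g in range(grid + 1):
        p = 1.0 / k + (1.0 / (k - 1) - 1.0 / k) * g / grid
        s = max(0.0, 1.0 - (k - 1) * p)
        value = pcs_r1(N, D, (s,) + (p,) * (k - 1))
        if best is None or value < best[0]:
            best = (value, p)
    return best
```

Comparing the returned p with the endpoints 1/k and 1/(k-1) shows whether, for a given (k, N, D), the minimum sits at an endpoint or in the interior of the interval.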

Consider the configuration (p,...,p,Ap), A ≥ 1. For any D, the expected
subset size is given by

(5.4)  $E(S) = \sum_{\sum \nu_i = N} \frac{N!}{\nu_1! \cdots \nu_k!}\, p_1^{\nu_1} \cdots p_k^{\nu_k}\, B_\nu$

where $B_\nu$ = number of $\nu_i$'s $\ge \nu_{\max} - D$. The probability of selecting a
non-best population is given by $(E(S) - P\{CS \mid R_1\})/(k-1)$. Tables have been provided
by Gupta and Nagel (1967) for the values of $P\{CS \mid R_1\}$, expected proportion
of cells selected and the probability of selecting a non-best population
corresponding to the configuration (p,...,p,Ap), A ≥ 1 for k = 2(1)10,
N = 2(1)15, A = 1(2)5 and D = 1(1)2. Another table gives the minimum D
such that $\inf_\Omega P\{CS \mid R_1\} \ge P^*$ for k = 2(1)10, N = 2(1)15 and P* = .75, .90.

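For selecting a subset containing the cell associated with $p_{[1]}$, Gupta and Nagel investigated
the rule $R_2$ which selects the cell with observation $x_i$ iff

(5.5)  $x_i \le \min(x_1, \ldots, x_k) + C$

where C is a given non-negative integer. In this case the probability of a
correct selection is given by

(5.6)  $P\{CS \mid R_2\} = G(k, N, C;\, p_{[1]}, \ldots, p_{[k]}) = \sum \frac{N!}{\nu_1! \cdots \nu_k!}\, p_{[1]}^{\nu_1} \cdots p_{[k]}^{\nu_k}$,

the summation being over all $(\nu_1, \ldots, \nu_k)$ with $\sum \nu_i = N$ and $\nu_1 \le \nu_j + C$,
j = 1,...,k.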


The  following  lemma  has  been  proved. 


Lemma 5.2.  (i) If the sum $p_{[i]} + p_{[j]}$, $1 < i < j \le k$, is kept constant,
$P\{CS \mid R_2\}$ decreases as we pass from the configuration
$(p_{[1]}, \ldots, p_{[i]}, \ldots, p_{[j]}, \ldots, p_{[k]})$ to $(p_{[1]}, \ldots, p_{[i]} - \varepsilon, \ldots, p_{[j]} + \varepsilon, \ldots, p_{[k]})$
where $0 < \varepsilon < p_{[i]}$.

(ii) If the sum $p_{[1]} + p_{[j]}$, $1 < j \le k$, is kept constant, $P\{CS \mid R_2\}$
decreases as we pass from the configuration $(p_{[1]}, \ldots, p_{[j]}, \ldots, p_{[k]})$ to
$(p_{[1]} + \varepsilon, \ldots, p_{[j]} - \varepsilon, \ldots, p_{[k]})$ where $0 < \varepsilon \le p_{[j]}$.

As a consequence of Lemma 5.2 the following theorem is obtained.

Theorem 5.2.  $P\{CS \mid R_2\}$ is minimized at a configuration $(p_{[1]}, \ldots, p_{[k]})$
given by (p,...,p,q), where q = 1 - (k-1)p, 0 < p ≤ 1/k.

Numerical evaluation of G(k,N,C; p,...,p,q) for k = 2(1)10, N = 2(1)15
and C = 0(1)4 shows that the overall minimum is given by the configuration
(1/k,...,1/k). For the configuration (p/A,p,...,p), A ≥ 1, tables are available
for the expected proportion, $P\{CS \mid R_2\}$ and the probability of selecting any
fixed cell with probability p for k = 2(1)10, N = 2(1)15, A = 1(2)5 and C = 0(1)2.

As we have seen above, the Gupta and Nagel procedures are based on a fixed sample
size. For the problem of selecting the cell with $p_{[k]}$, Panchapakesan (1971)
proposed a procedure $R_3$ which is based on inverse sampling. Observations are
taken one at a time until the count in any cell reaches a given number M. Let
$x_1, x_2, \ldots, x_k$ be the cell-counts at termination. Then $R_3$ is defined as follows:

$R_3$:  Select the cell with count $x_i$ iff

(5.7)  $x_i \ge M - D$

where D is a non-negative integer. For the rule $R_3$ the probability of a
correct selection is given by

(5.8)  $P\{CS \mid R_3\} = 1 - \sum_{\alpha=1}^{k-1} L_\alpha$

where

(5.9)  $L_\alpha = \sum \frac{M\,(\nu_1 + \cdots + \nu_k - 1)!}{\nu_1! \cdots \nu_k!}\, p_{[1]}^{\nu_1} \cdots p_{[k]}^{\nu_k}$,

the summation being over the set of values of $\nu_1, \ldots, \nu_k$ such that
$\nu_\alpha = M$, $0 \le \nu_k \le M-D-1$ and $0 \le \nu_\beta \le M-1$, $\beta = 1, \ldots, k-1$; $\beta \ne \alpha$. This multiple
sum can be expressed in an integral form and we get


(5.10) 


p{cs|r3>  =  1 


r((k-l)M»M') 
[r(M)]kr(M')  1 


where  M' 

(5.11) 


M-D, 


T  *  /  •  . 

a  i 

h. 


•  / 


,k:2  M-K  M'-l 

(  n  y  )  y 

i  =  l 


h  (1*v-*vi) 


(k-l)M+M 


,  dy....dy 


k-1 


and  ^  =  p^,,  i  =  l,...,k. 

It has been established by Panchapakesan that the statement of Lemma 5.1
holds in the case of $R_3$, and hence that

(5.12)  $\inf_\Omega P\{CS \mid R_3\} = \min_{r=2,\ldots,k} \Bigl( \min_{1/r \le p \le 1/(r-1)} F(k, M, D;\, (0, \ldots, 0, s, p, \ldots, p)) \Bigr)$

where $\Omega$ is the space of all configurations of the cell-probabilities, r is the
number of positive cell-probabilities in the configuration
(0,...,0,s,p,...,p), 0 < s ≤ p, and F(k,M,D; (0,...,0,s,p,...,p)) is the
probability of a correct selection for this configuration. Subject to the
condition that s + (r-1)p = 1, it has been shown that, for every fixed r,
$P\{CS \mid R_3\}$ increases in p and hence

(5.13)  $\inf_\Omega P\{CS \mid R_3\} = \min_{r=2,\ldots,k} F_r(k, M, D)$

where $F_r(k,M,D)$ denotes the probability of a correct selection for the
configuration (0,...,0,1/r,...,1/r). It has been recently shown (unpublished)
that $F_r(k,M,D)$ is monotonically decreasing in r. Thus

(5.14)  $\inf_\Omega P\{CS \mid R_3\} = F_k(k, M, D)$.


For  R3,  the  number  of  observations  (n)  is  a  random  variable.  Exact 
and  asymptotic  expressions  for  E(n)  corresponding  to  the  configuration 
♦l  =...=  ^  are  written  down  using  earlier  available  results.  Specific 
results  have  been  obtained  for  the  special  case  k=2. 

For selecting the cell associated with $p_{[k]}$, Nagel (1970) constructed a
symmetric rule based on N observations, which yields a minimum of PCS when
the cell-probabilities are equal and which maximizes PCS for the configuration
$(\theta, \ldots, \theta, \theta+\delta)$ where $\delta > 0$ and $k\theta + \delta = 1$. His rule $R_4$ is a randomized
rule which selects the cell with observation $x_i$ with probability $p_i$ where

(5.15)  $p_i = 1$  if  $x_i > d$
        $p_i = \rho$  if  $x_i = d$
        $p_i = 0$  if  $x_i < d$,

where $d \ge 0$ is determined from

(5.16)  $(1/k)^N \sum_{i=d+1}^{N} \binom{N}{i} (k-1)^{N-i} \le P^*$

and

(5.17)  $(1/k)^N \sum_{i=d}^{N} \binom{N}{i} (k-1)^{N-i} > P^*$.

It follows from above that

(5.18)  $\rho = \Bigl[ k^N P^* - \sum_{i=d+1}^{N} \binom{N}{i} (k-1)^{N-i} \Bigr] \Big/ \Bigl[ \binom{N}{d} (k-1)^{N-d} \Bigr]$.


II.  Multivariate  Normal  Case. 


Selection problems for multivariate normal populations have been
investigated when the populations are ranked in terms of (i) generalized
variance (ii) distance function and (iii) multiple correlation coefficient.
In the following discussion of these investigations, we assume that $\pi_1, \ldots, \pi_k$
are independent p-variate normal populations, where $\pi_i$ has mean vector $\mu_i$
and covariance matrix $\Sigma_i$ (i = 1,2,...,k). Let $x_{ij}$, j = 1,2,...,n, be a sample
of size n of vector observations from $\pi_i$ and $S_i = \frac{1}{n-1} \sum_{\alpha=1}^{n} (x_{i\alpha} - \bar{x}_i)(x_{i\alpha} - \bar{x}_i)'$.

(a) Selection in terms of Generalized Variance, $|\Sigma|$.  In this case $\mu_i$ and
$\Sigma_i$ are unknown. For selecting a subset containing the population associated
with the smallest $|\Sigma_i|$, Gnanadesikan and Gupta (1970) studied the following
rule R based on the sample covariance matrices $S_i$, i = 1,...,k.

R:  Select the population $\pi_i$ iff

(5.19)  $|S_i| \le \frac{1}{c}\, |S|_{\min}$,

where $|S|_{\min} = \min(|S_1|, \ldots, |S_k|)$ and $0 < c \le 1$. It has been established that

(5.20)  $\inf_\Omega P\{CS \mid R\} = P\{Y_1 \le \frac{1}{c}\, Y_j;\ j = 2, \ldots, k\}$,

where $Y_i$ (i = 1,...,k) are k independent random variables, each being the
product of p independent factors, the rth factor being distributed as a
chi-square variable with (n-r) degrees of freedom.




The exact distribution of $Y_i$ is unknown except when p=2. In the
case of p=2, we get $\inf_\Omega P\{CS \mid R\} = P\{Z_1 \le \frac{1}{\sqrt{c}}\, Z_j;\ j = 2, \ldots, k\}$, where
$Z_i$, i = 1,...,k, are k independent random variables each having a chi-square
distribution with 2(n-2) degrees of freedom. If, further, k=2, then
$c^{1/2}$ is the 100(1-P*) percentage point of an F variable with (2n-4, 2n-4)
degrees of freedom.

When p > 2, one can use Hoel's approximation for the distribution
of $Y_i$ in (5.20) or use the approximation of $\log \chi^2$ by the normal distribution.
Some study of these approximations was made by Gnanadesikan and Gupta.

Further, the performance of the procedure R was studied in terms of
risk functions using three different loss functions. If the ordered
generalized variances are denoted by $|\Sigma|_{[1]} \le |\Sigma|_{[2]} \le \cdots \le |\Sigma|_{[k]}$, the
different loss functions that were considered for the loss incurred by
including the population $\pi_i$ whose generalized variance is $|\Sigma_i|$ are:

(i) $L_1(\pi_i) = |\Sigma_i| / |\Sigma|_{[1]} - 1.0$,

(ii) $L_2(\pi_i)$ = (Rank of the population $\pi_i$) / $\frac{k(k+1)}{2}$, where the ranks increase
along with the generalized variance, and,

(iii) $L_3 = S/k$, where S is the number of populations included in the subset.

The computations of the risk functions associated with the above loss functions, for p=2,
k=2(1)5, $|\Sigma|_{[i]} = a^{2i-2}$ when a = 1.2(.2)2.0(.5)5.0, n = 3(1)7 and
P* = .75, indicate that $E(L_2)$ and $E(L_3)$ are sensitive to changes in the values
of the parameters and are decreasing functions of a and n. In the case of
$E(L_1)$, it increases in the range of values of a considered when n=3 and, for
other values of n, it increases up to a certain point and then decreases as
a increases. This lack of monotonicity in the behavior of $E(L_1)$, as the
'best' population moves further away from the other populations, and the
difficulty of its interpretation render $E(L_1)$ less suitable than $L_2$ and
$L_3$. Comparing $L_2$ and $L_3$, $L_3$, due to the ease of interpretation, would be
more appropriate as the criterion of performance of the procedure R. Finally,
the procedure R is shown to be monotone.

Suppose we consider a partition of the p variables into two sets of
$q_1$ and $q_2$ components, respectively, where $q_1 + q_2 = p$. The corresponding
partition of $\Sigma_i$ is denoted by

$\Sigma_i = \begin{pmatrix} \Sigma_{11}^{(i)} & \Sigma_{12}^{(i)} \\ \Sigma_{21}^{(i)} & \Sigma_{22}^{(i)} \end{pmatrix}$.

Here we assume that $\Sigma_i$, $\Sigma_{11}^{(i)}$, $\Sigma_{22}^{(i)}$ are all positive definite. We are
interested in selecting a subset containing the population associated with
the smallest $|\Sigma_i| / |\Sigma_{11}^{(i)}| = |\Sigma_{22}^{(i)} - \Sigma_{21}^{(i)} \Sigma_{11}^{(i)-1} \Sigma_{12}^{(i)}| = \sigma_i$, say. In other
words, if we consider for each population the conditional distribution of
the $q_2$ set when the $q_1$ set is fixed, then our criterion of ranking is the
conditional generalized variance. If the observations are taken on the
variables of the $q_2$ set, holding the variables of the $q_1$ set fixed, then
the problem reduces to selection in terms of the generalized variance for
the conditional normal distributions with dimensionality $q_2$, a problem solved
by Gnanadesikan and Gupta (1970). Let us consider the unconditional case in
which all the p variables are random and observations are taken on all of them



and use $\sigma_i$ as the criterion for ranking. Then consider the partition of the
sample covariance matrix denoted by

$S_i = \begin{pmatrix} S_{11}^{(i)} & S_{12}^{(i)} \\ S_{21}^{(i)} & S_{22}^{(i)} \end{pmatrix}$.

We compute $s_i = |S_{22}^{(i)} - S_{21}^{(i)} S_{11}^{(i)-1} S_{12}^{(i)}|$. Gupta and Panchapakesan (1969a)
studied the following rule R' for selecting the population with smallest $\sigma_i$.

R':  Select $\pi_i$ iff

(5.21)  $s_i \le \frac{1}{c'} \min(s_1, \ldots, s_k)$

where $0 < c' = c'(k, P^*, n, q_1, q_2) < 1$ is chosen to satisfy the P*-condition.
It is shown that

(5.22)  $\inf_\Omega P\{CS \mid R'\} = \int_0^\infty [1 - G(c'x)]^{k-1}\, dG(x)$,

where G(x) is the cdf of a random variable which is the product of $q_2$
independent $\chi^2$ variables with degrees of freedom $n-q_1-1, n-q_1-2, \ldots, n-q_1-q_2$,
respectively.

(b) Selection in terms of distance function.

Suppose the mean vectors $\mu_i$ are unknown and $\Sigma_i = \Sigma$ (known) for all i.
Let $\lambda_i = \mu_i' \Sigma^{-1} \mu_i$, the Mahalanobis distance function of the population $\pi_i$
from the origin. Let $y_{ij} = x_{ij}' \Sigma^{-1} x_{ij}$, j = 1,...,n; i = 1,...,k. Then
$y_i = \sum_{j=1}^{n} y_{ij}$ has the non-central $\chi^2$ distribution with np degrees of freedom
and non-centrality parameter $\lambda_i' = n\lambda_i$. We are interested in selecting a
subset containing the population with the largest $\lambda_i$. Gupta (1966b) proposed
and studied the following rule R.

R: Select the population $\pi_i$ iff

(5.23)  $y_i \ge c \max(y_1, \ldots, y_k)$

where $0 < c = c(k, n, p, P^*) < 1$ is determined to satisfy the P*-condition.
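As a small illustration of how rule R in (5.23) acts on data, the following sketch applies the rule to a list of already computed statistics. The function name `select_subset` and the numerical values are hypothetical and serve only as an example; the constant c is assumed to have been determined beforehand from the P*-condition.

```python
def select_subset(y, c):
    """Rule R of (5.23): keep index i when y[i] >= c * max(y)."""
    if not 0.0 < c < 1.0:
        raise ValueError("c must lie strictly between 0 and 1")
    cutoff = c * max(y)
    return [i for i, value in enumerate(y) if value >= cutoff]


# Hypothetical values of y_1, ..., y_4 (illustrative numbers, not from the report).
y_values = [12.0, 30.0, 25.5, 8.1]
selected = select_subset(y_values, c=0.8)  # cutoff is about 24
```

Since the population with the largest statistic always meets the cutoff, the selected subset is never empty.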
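The probability of a correct selection is given by

(5.24)  $P\{CS|R\} = \int_0^{\infty} \Big[ \prod_{j=1}^{k-1} F_{\lambda'_{[j]}}(x/c) \Big] \, dF_{\lambda'_{[k]}}(x)$,

where $\lambda'_{[1]} \le \lambda'_{[2]} \le \cdots \le \lambda'_{[k]}$ are the ordered $\lambda'$ values and $F_{\lambda'}(x)$
denotes the distribution function of a non-central $\chi^2$ variable with np
degrees of freedom and non-centrality parameter $\lambda'$. Since $\{F_{\lambda'}\}$ is
stochastically increasing in $\lambda'$,

(5.25)  $\inf_{\Omega} P(CS|R) = \inf_{\lambda' \ge 0} \int_0^{\infty} F_{\lambda'}^{k-1}(x/c) \, dF_{\lambda'}(x)$.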

Gupta showed that, for k = 2, the integral on the right hand side of (5.25)
is non-decreasing in $\lambda'$ and hence the infimum takes place when $\lambda' = 0$. Thus,
the constant c satisfies the condition

(5.26)  $\int_0^{\infty} G_m^{k-1}(x/c) \, dG_m(x) = P^*$,

where $G_m(x)$ is the central $\chi^2$ distribution function with m = np degrees of freedom.
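The condition (5.26) can be solved numerically for c. The sketch below is one possible way to do so under simplifying assumptions that are not part of the report: the degrees of freedom np are taken to be even (so that the central chi-square cdf has a finite closed form), the integral is approximated by a midpoint rule on a truncated range, and c is found by bisection. All function names are hypothetical.

```python
import math


def chi2_cdf_even(x, df):
    """Central chi-square cdf for even df, via the finite Poisson-sum form."""
    if x <= 0.0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_pdf_even(x, df):
    """Central chi-square density for even df."""
    m = df // 2
    half = x / 2.0
    return 0.5 * math.exp(-half) * half ** (m - 1) / math.factorial(m - 1)


def pcs_at_lambda_zero(c, k, df, upper=100.0, steps=8000):
    """Midpoint-rule approximation of the integral on the left of (5.26)."""
    dx = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += chi2_cdf_even(x / c, df) ** (k - 1) * chi2_pdf_even(x, df) * dx
    return total


def solve_c(p_star, k, df, tol=1e-6):
    """Largest c in (0, 1) whose integral is still at least p_star (bisection)."""
    lo, hi = 1e-6, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pcs_at_lambda_zero(mid, k, df) >= p_star:
            lo = mid
        else:
            hi = mid
    return lo
```

For k = 2 and np = 2 the integral reduces to 1/(1 + c), so `solve_c(0.75, 2, 2)` should be close to 1/3; this gives a quick check of the quadrature. The truncation point `upper=100.0` is adequate only for small degrees of freedom and would need to be enlarged otherwise.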

For selecting the population associated with $\lambda'_{[1]}$, a similar procedure
was studied, namely,

R': Select $\pi_i$ iff

(5.27)  $y_i \le b \min(y_1, \ldots, y_k)$,

where $b = b(k, n, p, P^*) > 1$ is determined so as to satisfy the P*-condition.

In this case, we obtain

(5.28)  $\inf_{\Omega} P\{CS|R'\} = \inf_{\lambda' \ge 0} \int_0^{\infty} [1 - F_{\lambda'}(x/b)]^{k-1} \, dF_{\lambda'}(x)$.

The integral is shown to be monotonically increasing in $\lambda'$ for k = 2.

For the procedures R and R' defined above Gupta and Studden (1970)
established the monotonicity of the integrals appearing in (5.25) and (5.28)
w.r.t. $\lambda'$ in the general case $k \ge 2$. They proved the following theorem for
that purpose.

Theorem 5.3. Let $g_j(x)$, j = 0,1,2,... be a sequence of density functions on
the interval $[0, \infty)$ and define

(5.29)  $f_{\lambda}(x) = \sum_{j=0}^{\infty} \frac{e^{-\lambda} \lambda^j}{j!} \, g_j(x), \quad x \ge 0$.

For a fixed integer $k \ge 2$ and c > 1, let

(5.30)  $I(\lambda) = \int_0^{\infty} F_{\lambda}^{k-1}(cx) \, dF_{\lambda}(x)$

and

(5.31)  $J(\lambda) = \int_0^{\infty} [1 - F_{\lambda}(x/c)]^{k-1} \, dF_{\lambda}(x)$.

Let A denote the condition that, for each integer $\ell \ge 0$,

(5.32)  $\sum_{i=0}^{\ell} \frac{1}{i! \, (\ell-i)!} \big[ g_{\ell-i}(x) \{ G_{i+1}(cx) - G_i(cx) \} - c \, g_i(cx) \{ G_{\ell-i+1}(x) - G_{\ell-i}(x) \} \big] \ge 0$.

Then, the functions $I(\lambda)$ and $J(\lambda)$ are non-decreasing in $\lambda$ provided that
the condition A holds. Further, both the functions are strictly increasing
in $\lambda$ if the condition A holds with strict inequality for some integer $\ell$.

As pointed out earlier, the condition (5.32) can be obtained from the
condition (3.9). In fact, Gupta and Studden verify, in the cases of non-central
chi-square and non-central F distributions, a condition which is stronger than
(5.32). This stronger condition states that the sum of the terms on the left
hand side of (5.32) corresponding to i and $\ell - i$, $i = 0, \ldots, [\ell/2]$, is positive
and this is the same as the condition (3.12) for proper choices of h(x) and the
weight functions.

To be precise, Gupta and Studden considered the case where the $\Sigma_i$ are
not necessarily all equal but known. With a slight modification, namely,
$y_{ij} = x_{ij}' \Sigma_i^{-1} x_{ij}$, we have essentially Gupta's procedures R and R'. They
also studied procedures when the $\Sigma_i$'s are different but all unknown. In this
case, let $z_i = \bar{x}_i' S_i^{-1} \bar{x}_i$. Then, for the selection of the population with
the largest and smallest distance functions, the procedures studied are,
respectively,

$R_1$: Select $\pi_i$ iff

(5.33)  $c z_i \ge \max(z_1, \ldots, z_k)$

and

$R_1'$: Select $\pi_i$ iff

(5.34)  $z_i \le b \min(z_1, \ldots, z_k)$

where $c = c(k, p, n, P^*) > 1$ and $b = b(k, p, n, P^*) > 1$ are determined so that
the P*-condition is satisfied. It is known that $z_i$ is essentially distributed
as a non-central F variable, whose density is of the form (5.29). Hence
Theorem 5.3 applies in this case. It is shown that the sufficient condition
A is satisfied. Thus we obtain the equations to determine the constants c
and b, namely,

(5.35)  $\int_0^{\infty} F_{p,n-p}^{k-1}(cx) \, dF_{p,n-p}(x) = P^*$

and

(5.36)  $\int_0^{\infty} [1 - F_{p,n-p}(x/b)]^{k-1} \, dF_{p,n-p}(x) = P^*$.

Alam and Rizvi (1966) have also considered the problem of selection in
terms of distance function. For $\Sigma_i$ unknown, their procedure is the same as that
of Gupta and Studden (which was originally studied in a technical report
issued in 1965) but the monotonicity of the integral involved is established
rather directly and not by obtaining a sufficient condition applicable to a
class of distributions including non-central chi-square and non-central F
distributions. Further, in the case of $\Sigma_i$ known, Alam and Rizvi use the
procedure defined by (5.33) with $\Sigma_i$ in the place of $S_i$; in other words,
using the statistics $\bar{x}_i' \Sigma_i^{-1} \bar{x}_i$. This is different from the procedure of
Gupta (1966b) and Gupta and Studden (1970), who have observed the undesirability
of using $\bar{x}_i' \Sigma_i^{-1} \bar{x}_i$ in the sense that the constant evaluated subject to the
P*-condition is independent of n.
(c) Selection in terms of multiple correlation coefficient.

Let $\rho_i = \rho_{i;1 \cdot 2 \ldots p}$ be the multiple correlation coefficient between the
first variable and the rest in the population $\pi_i$. Let $0 \le \rho_{[1]} \le \cdots \le \rho_{[k]} \le 1$
be the ordered values of the $\rho_i$. Gupta and Panchapakesan (1969a) investigated
the problem of selecting a subset containing the population associated with
$\rho_{[k]}$ (or $\rho_{[1]}$). Denote the sample multiple correlation coefficients by
$R_i = R_{i;1 \cdot 2 \ldots p}$. Two cases arise:

(i) The case in which $x_{i2}, \ldots, x_{ip}$ are fixed, called the conditional case;

(ii) The case in which $x_{i2}, \ldots, x_{ip}$ are random, called the unconditional case.

The following rule R has been investigated by Gupta and Panchapakesan for
the selection of $\rho_{[k]}$.

R: Select $\pi_i$ iff

(5.37)  $R_i^{*2} \ge c \max(R_1^{*2}, \ldots, R_k^{*2})$

where $R_i^{*2} = R_i^2/(1 - R_i^2)$, i = 1,...,k, and $0 < c = c(k, P^*, p, n) < 1$ is chosen
subject to the P*-condition. In the formal statement of R we do not make the
distinction between the conditional and unconditional cases.

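Letting $\lambda_i = \rho_i^2$, i = 1,...,k, the density of $R_i^{*2}$ is given by

(5.38)  $u_{\lambda}(x) = \sum_{j=0}^{\infty} \frac{\Gamma(q+m+j)}{\Gamma(q+m) \, j!} \, \lambda^j (1-\lambda)^{q+m} f_{2(q+j),2m}(x)$

in the unconditional case and by

(5.39)  $u_{\lambda}(x) = \sum_{j=0}^{\infty} \frac{e^{-m\lambda} (m\lambda)^j}{j!} \, f_{2(q+j),2m}(x)$

in the conditional case, where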


(5.40)  $q = (p-1)/2, \quad m = (n-p)/2$

and $f_{r,s}(x)$ denotes the density of the F-distribution with r and s degrees
of freedom. It is easy to show that $u_{\lambda}(x)$ has a monotone likelihood ratio
in x and hence the distribution of $R^{*2}$ is stochastically increasing in $\lambda$.
Thus we obtain

(5.41)  $\inf_{\Omega} P(CS|R) = \inf_{\lambda} \int_0^{\infty} U_{\lambda}^{k-1}(x/c) \, dU_{\lambda}(x)$,

where $U_{\lambda}(x)$ is the cdf corresponding to $u_{\lambda}(x)$.

In the conditional case, the condition A of Theorem 5.3 is satisfied
and hence the infimum takes place for $\lambda = 0$. For the unconditional case the
same result is shown by proving the following theorem.

Theorem 5.4. Let $g_j(x)$, j = 0,1,2,... be a sequence of density functions on
the interval $[0, \infty)$ and define

(5.42)  $f_{\lambda}(x) = \sum_{j=0}^{\infty} \frac{(q)_j}{j!} \, \lambda^j (1-\lambda)^q \, g_j(x), \quad x \ge 0, \; 0 \le \lambda < 1$.

For a fixed integer $k \ge 2$ and 0 < c < 1, let $I(\lambda)$ and $J(\lambda)$ be defined as in
(5.30) and (5.31). Let B denote the condition that, for each integer $\ell \ge 0$,

(5.43)  $\sum_{i=0}^{\ell} \frac{(q)_i (q)_{\ell-i}}{i! \, (\ell-i)!} \big[ (q+i) \{ G_{i+1}(x/c) - G_i(x/c) \} g_{\ell-i}(x) - c^{-1} (q+\ell-i) \, g_i(x/c) \{ G_{\ell-i+1}(x) - G_{\ell-i}(x) \} \big] \ge 0$

where $(q)_s = q(q+1) \cdots (q+s-1)$ and $G_j(x)$ is the cdf corresponding to $g_j(x)$.
Then, $I(\lambda)$ and $J(\lambda)$ are non-decreasing in $\lambda$ if condition B holds and the
two functions are strictly increasing in $\lambda$ if strict inequality holds in condition
B for some integer $\ell$.




It can be easily verified that the condition B is satisfied in the
unconditional case. Thus, in either case, we get

(5.44)  $\inf_{\Omega} P(CS|R) = \int_0^{\infty} F_{2q,2m}^{k-1}(x/c) \, dF_{2q,2m}(x)$,

where $F_{2q,2m}(x)$ is the cdf corresponding to $f_{2q,2m}(x)$. Since the distribution
of $R^{*2}$ when $\lambda = 0$ is the same in both conditional and unconditional
cases, the constant c used in the procedure is the same and is given by

(5.45)  $\int_0^{\infty} F_{2q,2m}^{k-1}(x/c) \, dF_{2q,2m}(x) = P^*$.

When q and m are integers, i.e., p and n are odd, we can use a
series expansion for $F_{2q,2m}(x)$ to obtain formulae for computing c for
specified values of q, m and P*. The final result is:


(5.46)  P*  = - - 

r(q)r(m)(l-c)m 

qk-1  (k-l)(m-l)  . 

lr  1 

x  l  l  (-l)a(qK'1)a(k-l,j)(T^~)  K(c,m,q,a,j ) 
a=0  j«0  a  1  '  C 


where $a(r,j)$ and $K(c,m,q,\alpha,j)$ are given by the following recurrence relations

(5.47)  $a(1,j) = \begin{cases} 1, & j = 0 \\ q(q+1) \cdots (q+j-1), & 1 \le j \le m-1 \end{cases}$

and for r > 1

(5.48)  $a(r,j) = \begin{cases} 1, & j = 0 \\ \sum_{s=\max(j-(r-1)(m-1),\,0)}^{\min(m-1,\,j)} a(1,s) \, a(r-1,j-s), & 1 \le j \le r(m-1). \end{cases}$
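The recurrence (5.48) is a convolution: the numbers a(r, j) are the coefficients of the r-th power of the polynomial whose coefficients are a(1, 0), ..., a(1, m-1). The sketch below computes them level by level. The function name `power_coefficients` is hypothetical, and the starting coefficients a(1, j) are passed in by the caller rather than computed, so the code makes no assumption about their exact form beyond a(1, 0) = 1.

```python
def power_coefficients(a1, r):
    """Return [a(r, 0), ..., a(r, r(m-1))] from (5.48), given a1[j] = a(1, j)."""
    m = len(a1)
    current = list(a1)
    for level in range(2, r + 1):
        size = level * (m - 1) + 1
        nxt = [0] * size
        for j in range(size):
            lo = max(j - (level - 1) * (m - 1), 0)
            hi = min(m - 1, j)
            # Convolution of a(1, .) with a(level - 1, .), limits as in (5.48).
            nxt[j] = sum(a1[s] * current[j - s] for s in range(lo, hi + 1))
        current = nxt
    return current


# With a(1, .) = (1, 1) the coefficients are binomial coefficients.
binomial_row = power_coefficients([1, 1], 3)
```

For instance, starting from (1, 2, 6) the second level is the list of coefficients of (1 + 2x + 6x^2)^2.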


Since $1 - F_{2q,2m}(x) = F_{2m,2q}(1/x)$, for a given set of q, m, k and P*, the
constant d of the procedure R' is the same as the constant c of the procedure
R with q and m interchanged. It can be shown that the procedures R and R'
have the monotonicity property.
Govindarajulu and Gore (1971) have discussed selection from bivariate normal
populations in terms of their product-moment correlation coefficient. If $\rho_i$
denotes the correlation coefficient in the population $\pi_i$ (i = 1,...,k), then
to select a subset containing the population with $\rho_{[k]}$, Govindarajulu and Gore
have investigated the following two rules $R_1$ and $R_2$ based on the sample
product-moment correlation coefficients $r_i$ and the transforms
$s_i = \frac{1}{2} \log \frac{1+r_i}{1-r_i}$ (i = 1,...,k), respectively. $R_1$ selects $\pi_i$ iff

(5.51)  $r_i \ge \max_{1 \le j \le k} r_j - h$

and $R_2$ selects $\pi_i$ iff

(5.52)  $s_i \ge \max_{1 \le j \le k} s_j - h$

where h > 0 is chosen so as to satisfy the P*-condition. It has been shown
that, for large n, h satisfies

(5.53)  $P(U_i \le h, \; i = 1, \ldots, k-1) = P^*$,

where the $U_i$ have a multivariate normal distribution with $E(U_i) = 0$,
$Var(U_i) = 1$, $E(U_i U_j) = 1/2$, $i \ne j$. If we are interested in ranking $|\rho_i|$, then
the procedure suggested is to select $\pi_i$ iff $|r_i| \ge \max |r_j| - h$, where the large
sample solution of h is given by (5.53). It is to be noted that ranking in
terms of $|\rho_i|$ is really a special case of ranking in terms of the multiple
correlation coefficient investigated by Gupta and Panchapakesan (1969a).
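The two rules (5.51) and (5.52) differ only in the statistic being compared, since the transform in $R_2$ is the inverse hyperbolic tangent of $r_i$. A minimal sketch follows; the function name `select_by_correlation`, the sample correlations, and the value of h are hypothetical and chosen only for illustration, with h assumed to have been fixed in advance from the P*-condition.

```python
import math


def select_by_correlation(r, h, use_transform=False):
    """Rule R1 of (5.51) on r_i, or rule R2 of (5.52) on s_i = atanh(r_i)."""
    stats = [math.atanh(value) for value in r] if use_transform else list(r)
    cutoff = max(stats) - h
    return [i for i, s in enumerate(stats) if s >= cutoff]


# Hypothetical sample correlations from k = 4 populations.
sample_r = [0.10, 0.55, 0.62, 0.30]
subset_r1 = select_by_correlation(sample_r, h=0.1)
subset_r2 = select_by_correlation(sample_r, h=0.1, use_transform=True)
```

With the same numerical h the two rules need not select the same subset, because the transform stretches differences between correlations that are far from zero.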


6.  Distribution-Free  Procedures. 




In this section we discuss a non-parametric procedure for selection in
terms of quantiles of a given order based on order statistics and some
procedures based on ranks and paired comparisons.

(a)  Selection  in  terms  of  quantiles. 

Suppose $\pi_i$ (i = 1,...,k) is a continuous population with distribution
function $F_i$ whose form is not known. It is assumed $x_{\alpha}(F_i)$ is the unique
$\alpha$-quantile of the distribution $F_i$. Let $F_{[i]}$ denote the distribution with the
ith smallest $\alpha$-quantile. The problem of selecting a subset containing the
population with the largest $\alpha$-quantile has been studied by Rizvi and Sobel
(1967). Their formulation of the problem requires the P*-condition to be met
for the set $\Omega$ of all k-tuples $(F_1, \ldots, F_k)$ for which $F_{[k]}$ is stochastically
larger than any other population.

For $0 < \alpha < 1$, we take n sufficiently large so that $1 \le (n+1)\alpha \le n$ and
define a positive integer r by the inequalities $r \le (n+1)\alpha < r+1$. Then the
procedure $R_1 = R_1(c)$ proposed by Rizvi and Sobel is defined in terms of a
positive integer c ($1 \le c \le r-1$) and the order statistics $Y_{j,i}$, where $Y_{j,i}$
denotes the jth order statistic from the population $\pi_i$ based on n independent
observations.

$R_1$: Select $F_i$ iff

(6.1)  $Y_{r,i} \ge \max_{1 \le j \le k} Y_{r-c,j}$

where c is the smallest integer with $1 \le c \le r-1$ for which $\inf_{\Omega} P\{CS|R_1\} \ge P^*$.

For any $\alpha$ and k, it may happen that a value of $c \le r-1$ does not exist
for some pairs (n, P*). However, if
$P^* \le P_1 = r \binom{n}{r} \sum_{i=0}^{k-1} (-1)^i \binom{k-1}{i} B(r, \, n(i+1)-r+1)$,
where $B(\cdot,\cdot)$ is the beta function, then
a value of $c \le r-1$ exists and is unique. The value of c has to satisfy

(6.2)  $\int_0^1 G_{r-c}^{k-1}(u) \, dG_r(u) \ge P^*$,

where $G_r(u) = I_u(r, n-r+1)$ is the standard incomplete beta function.
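To make (6.2) concrete, the following sketch evaluates its left-hand side numerically and searches for the smallest admissible c. It relies on the identity $I_u(j, n-j+1) = P(\mathrm{Binomial}(n,u) \ge j)$ for integer j and on a midpoint rule over [0, 1]; both the function names and the quadrature choice are assumptions made here for illustration, not part of the procedure as published.

```python
import math


def order_stat_cdf(u, j, n):
    """G_j(u) = I_u(j, n-j+1) = P(Binomial(n, u) >= j) for integer j."""
    return sum(math.comb(n, i) * u**i * (1.0 - u) ** (n - i) for i in range(j, n + 1))


def order_stat_pdf(u, r, n):
    """Density of G_r, i.e. of the r-th order statistic of n uniforms."""
    return r * math.comb(n, r) * u ** (r - 1) * (1.0 - u) ** (n - r)


def lhs_62(c, r, n, k, steps=2000):
    """Midpoint-rule approximation of the integral in (6.2)."""
    du = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        total += order_stat_cdf(u, r - c, n) ** (k - 1) * order_stat_pdf(u, r, n) * du
    return total


def smallest_c(p_star, r, n, k):
    """Smallest c in 1..r-1 satisfying (6.2), or None if no such c exists."""
    for c in range(1, r):
        if lhs_62(c, r, n, k) >= p_star:
            return c
    return None
```

As a usage example, with n = 9 and $\alpha$ = 1/2 one has r = 5, and `smallest_c(0.9, 5, 9, 3)` returns the constant for k = 3 populations; a request such as P* = 0.999999 returns `None` because even c = r - 1 cannot reach it.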

It has also been shown that $E(S|R_1)$ is maximized in $\Omega$ when the
populations are identical. Further, consider the configuration with
$\theta_{[k]} - \theta_{[i]} = \Delta$ (i = 1,...,k-1) under the assumption that $F_i(x) = F(x - \theta_i)$.
Let $n_1(\epsilon)$ be the approximate sample size (obtained by using asymptotic theory
of quantiles) required to satisfy

(6.3)  $E\{S|R_1\} \le 1 + \epsilon$

under this configuration. Similarly $n_2(\epsilon)$ denotes the sample size required to
satisfy (6.3) when we use the procedure $R_2$ based on sample means $\bar{x}_i$
(i = 1,...,k), which selects the population corresponding to $\bar{x}_i$ iff
$\bar{x}_i \ge \max_{1 \le j \le k} \bar{x}_j - d$, where d > 0 is chosen
to satisfy the P*-condition. Then the asymptotic relative efficiency of $R_1$
relative to $R_2$ is defined by

(6.4)  $ARE(R_1, R_2) = \lim_{\epsilon \to 0} [n_2(\epsilon)/n_1(\epsilon)]$.

For $\alpha = 1/2$ and normal shift alternatives with $\sigma = 1$, $ARE(R_1, R_2) = 2/\pi$. Again,
for $\alpha = 1/2$ and two-sided exponential shift alternatives with continuous
symmetric densities about the median value $\theta_i$, $ARE(R_1, R_2) = 2$.

Desu and Sobel (1971) have discussed non-parametric procedures for quantile
selection under a modified goal of selecting a fixed-size subset, which is described
elsewhere in this paper. Barlow and Gupta (1969) investigated quantile
selection in certain restricted classes of distributions and this is also discussed
elsewhere.
(b) Procedures based on paired comparisons.

In the paired comparison approach, we compare all the k(k-1)/2
possible pairs of the populations $\pi_1, \ldots, \pi_k$. We make n replications of
each comparison. For i, j = 1,...,k; $i \ne j$ and $\gamma = 1, \ldots, n$, let

(6.5)  $X_{ij\gamma} = \begin{cases} 1 & \text{if } \pi_i \to \pi_j \text{ in the } \gamma\text{th replication} \\ 0 & \text{otherwise} \end{cases}$

where $\pi_i \to \pi_j$ means that $\pi_i$ is preferred to $\pi_j$.

It is assumed that ties are not possible. Let

(6.6)  $P\{X_{ij\gamma} = 1\} = \phi_{ij}$ and $P\{X_{ij\gamma} = 0\} = \phi_{ji} = 1 - \phi_{ij}$.

The score $a_i$ of the population $\pi_i$ is defined by

(6.7)  $a_i = \sum_{\gamma=1}^{n} a_{i\gamma} = \sum_{\gamma=1}^{n} \sum_{j \ne i} X_{ij\gamma}$

where $a_{i\gamma}$ denotes the (partial) score of $\pi_i$ in the $\gamma$th replication. It
is easy to see that $\sum_{i=1}^{k} a_{i\gamma} = k(k-1)/2$ and $\sum_{i=1}^{k} a_i = nk(k-1)/2$.

It is assumed that the preference probabilities satisfy a linear model.
To be specific, let $\theta_i$ be the true "merit" of $\pi_i$ when judged on some
characteristic. Let $y_i$ (i = 1,...,k) be the observed merit of $\pi_i$ on which the
comparisons are based. Suppose that $\pi_i \to \pi_j$ if $y_i > y_j$ and $\pi_j \to \pi_i$ otherwise.
Then the preference probabilities are said to satisfy a linear model
if $\phi_{ij} = P\{y_i - y_j > 0\}$ for all i and j can be expressed as $H(\theta_i - \theta_j)$,
where H(x) is a distribution function on the real line with H(-x) = 1 - H(x).
Under the above linear model, Trawinski and David (1963) proposed the following
rule R based on the scores $a_i$ for selecting a subset containing the
population with the largest $\theta_i$.

R: Select $\pi_i$ iff $a_i \ge \max_{1 \le j \le k} a_j - \nu$

where $\nu = \nu(k, n, P^*)$ is a non-negative integer to be chosen so as to satisfy
the P*-condition. Under the linear model, it has been shown that the least
favorable configuration is given by $\phi_{ij} = 1/2$ for all i and j ($i \ne j$) and
is denoted by C(1/2). Thus $\nu$ is the smallest integer for which

(6.8)  $P\{CS|R; \, C(1/2)\} \ge P^*$.
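The scoring in (6.7) and the resulting rule can be sketched in a few lines. The data layout (one k x k 0/1 matrix per replication), the function names, and the outcomes below are all hypothetical; the integer $\nu$ is assumed to have been obtained separately from (6.8).

```python
def scores(x):
    """a_i of (6.7); x[g][i][j] = 1 when population i is preferred to j in replication g."""
    k = len(x[0])
    return [sum(rep[i][j] for rep in x for j in range(k) if j != i) for i in range(k)]


def select_by_score(a, nu):
    """Keep index i when a[i] >= max(a) - nu."""
    cutoff = max(a) - nu
    return [i for i, value in enumerate(a) if value >= cutoff]


# Hypothetical outcomes for k = 3 populations and n = 2 replications.
outcomes = [
    [[0, 1, 1], [0, 0, 1], [0, 0, 0]],  # replication 1
    [[0, 0, 1], [1, 0, 1], [0, 0, 0]],  # replication 2
]
a = scores(outcomes)
```

In this example the scores add up to nk(k-1)/2 = 6, as noted after (6.7), which is a convenient sanity check on any data set without ties.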

Trawinski (1969) obtains an approximation for $E\{S|R\}$ in terms of (k-1)-variate
normal distributions and transforms these into more numerically tractable
integrals. His approximation is obtained under a slippage configuration which
is specified by

(6.9)  $\phi_{ij} = 1/2$ for i, j = 1,...,k-1;  $\phi_{ki} = \phi$ for i = 1,...,k-1,


and  is  valid  whenever 


<  j  +  ~  (k/(k+l) }1/2. 


(c)  Procedures  based  on  ranks. 

Let $X_{ij}$, $j = 1, \ldots, n_i$, be independent observations from population
$\pi_i$ (i = 1,...,k) whose associated distribution function is $F_{\theta_i}(x)$. The
functional form of $F_{\theta}$ is not known but it is assumed that $\{F_{\theta}\}$ is a
stochastically increasing family. All the observations are pooled and $R_{ij}$
denotes the rank of $X_{ij}$ in the combined sample of $N = n_1 + \cdots + n_k$
observations. Let $Z(1) \le Z(2) \le \cdots \le Z(N)$ denote an ordered sample of size N
from a continuous distribution G such that $-\infty < a(r) = E_G(Z(r)) < \infty$
(r = 1,...,N). With each of the observations $X_{ij}$ associate the number $a(R_{ij})$
and define

(6.10)  $H_i = n_i^{-1} \sum_{j=1}^{n_i} a(R_{ij}), \quad i = 1, \ldots, k$.

Using the quantities $H_i$, Gupta and McDonald (1970) defined the following three
classes of procedures for selecting a subset containing the population with
the largest $\theta_i$:

(6.11)
  $R_1(G)$: Select $\pi_i$ iff $H_i + d \ge \max(H_1, \ldots, H_k)$,  $d \ge 0$
  $R_2(G)$: Select $\pi_i$ iff $c H_i \ge \max(H_1, \ldots, H_k)$,  $c \ge 1$
  $R_3(G)$: Select $\pi_i$ iff $H_i \ge D$,  $-\infty < D < \infty$
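The statistics $H_i$ in (6.10) and the three rules in (6.11) are straightforward to compute once the pooled ranks are available. The sketch below uses the scores a(r) = r by default and assumes there are no ties, as is the case with probability one for continuous distributions. The function names, the data, and the constants d, c and D are hypothetical.

```python
def rank_scores(samples, a=lambda r: r):
    """H_i of (6.10): mean of a(R_ij) within sample i, ranks from the pooled sample."""
    pooled = sorted(v for sample in samples for v in sample)
    rank = {v: r for r, v in enumerate(pooled, start=1)}  # assumes no ties
    return [sum(a(rank[v]) for v in sample) / len(sample) for sample in samples]


def rule_r1(h, d):
    return [i for i, hi in enumerate(h) if hi + d >= max(h)]


def rule_r2(h, c):
    return [i for i, hi in enumerate(h) if c * hi >= max(h)]


def rule_r3(h, big_d):
    return [i for i, hi in enumerate(h) if hi >= big_d]


# Hypothetical data: k = 3 samples of size n = 3.
data = [[1.2, 0.4, 2.0], [3.1, 2.5, 0.9], [4.0, 3.3, 2.8]]
h_values = rank_scores(data)  # rank sums 8, 14, 23 divided by n = 3
```

Rules of type $R_1(G)$ and $R_2(G)$ always retain the sample with the largest $H_i$, whereas a rule of type $R_3(G)$ compares each $H_i$ with a fixed constant and can therefore return an empty list.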

All the three classes of rules are equivalent if k = 2. The following
theorem is established regarding the infimum of the probability of a correct
selection.

Theorem 6.1. For the procedures $R_1(G)$, $R_2(G)$ and $R_3(G)$,

(6.12)  $\inf_{\Omega} P\{CS|R_i(G)\} = \inf_{\Omega'} P\{CS|R_i(G)\}, \quad i = 1, 2, 3$

where $\Omega$ is the space of all configurations of $\theta = (\theta_1, \ldots, \theta_k)$ and
$\Omega' = \{\theta \in \Omega : \theta_{[k-1]} = \theta_{[k]}\}$. Further, for $R_3(G)$,

(6.13)  $\inf_{\Omega} P\{CS|R_3(G)\} = \inf_{\Omega_0} P\{CS|R_3(G)\}$,

where $\Omega_0 = \{\theta \in \Omega : \theta_{[1]} = \cdots = \theta_{[k]}\}$.


It should be noted that a result of the type (6.13) is not true in general
for $R_1(G)$ and $R_2(G)$. The procedures $R_1(G)$ (and their randomized analogs)
have been suggested by Bartlett and Govindarajulu (1968) for continuous
distributions differing by a location parameter. The procedures of the type $R_2(G)$
have been proposed by Blumenthal and Patterson (1969). For all these procedures
a result of the type (6.13) is not true in general. Rizvi and Woodworth (1970)
have given counterexamples to show that the least favorable configuration is
not always given by the identical distributions case.

In the cases of $R_1(G)$ and $R_2(G)$, Gupta and McDonald (1970) have obtained
bounds on the probability of a correct selection. It has been shown that

(6.14)  $\inf_{\Omega_0} P\{H_{(k)} \ge v\} \le \inf_{\Omega} P\{CS|R_1(G)\} \le \inf_{\Omega_0} P\{H_{(k)} \ge u\}$

and

(6.15)  $\inf_{\Omega_0} P\{H_{(k)} \ge v'\} \le \inf_{\Omega} P\{CS|R_2(G)\} \le \inf_{\Omega_0} P\{H_{(k)} \ge u'\}$,

where $H_{(k)}$ is the statistic $H_i$ associated with the distribution $F_{\theta_{[k]}}$
and u' and v' are given by

(6.16)  $u' = u'(c, k, n) = n^{-1} A \, [1 + c(k-1)]^{-1}$

and

(6.17)  $v' = v'(c, k, n) = (nc)^{-1} \sum_{r=N-n+1}^{N} a(r)$,

where $A = \sum_{r=1}^{N} a(r)$.


For the particular case where a(r) = r, $nH_i = T_i$, where the $T_i$ are
the rank-sum statistics. In this case we denote $R_i(G)$ by $R_i$. For this
special case, we obtain

(6.18)  $\inf_{\Omega} P\{CS|R_1\} \ge P\{U \le nd\}$,

where U is the Mann-Whitney statistic associated with samples of sizes n
and (k-1)n taken from two identically distributed populations. A similar
result is true for $R_2$.

As regards $R_3$, we observe that $R_3$ may not always select a non-empty
subset. A sufficient condition for selection of a non-empty subset is
that P* be sufficiently large so that $D \le A/N$. For large n, this sufficient
condition holds if P* > 1/2. The constant $D = D(k, n, P^*)$ for the rule $R_3$ is
found such that

(6.19)  $P\{U \le n^2(k - \tfrac{1}{2}) - n(D - \tfrac{1}{2})\} \ge P^*$.

Asymptotic expressions were obtained for $E\{S|R_1\}$ and $E\{S|R_3\}$.

Assuming $n_i = n$, for large n, the distribution of $T' = (T_1, \ldots, T_k)$
is approximately multivariate normal with mean vector $\mu_T' = (\mu_1, \ldots, \mu_k)$
and variance-covariance matrix $\Sigma_T$. Let A be a (k-1) x k matrix given by

(6.20)  $A = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & -1 \\ 0 & 1 & 0 & \cdots & 0 & -1 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -1 \end{pmatrix}$.

Define $W_v = A_v T$, where $A_v$ is the (k-1) x k matrix obtained from matrix A
by moving column j to column j+1, j = v, v+1,...,k-1 and replacing column
v by column k. Let $\mu_v^* = A_v \mu_T$ and $\Sigma_v = A_v \Sigma_T A_v'$. Then we have the following
theorem.

Theorem 6.2. If $\Sigma_v$ is non-singular for v = 1,...,k, then

(6.21)  $E\{S|R_1\} \approx \sum_{v=1}^{k} K_v \int_{-\infty}^{d} \cdots \int_{-\infty}^{d} \exp[-(W_v - \mu_v^*)' \Sigma_v^{-1} (W_v - \mu_v^*)/2] \prod_{i=1}^{k-1} dW_{vi}$

where $K_v = [(2\pi)^{k-1} |\Sigma_v|]^{-1/2}$. For $R_3$,

(6.22)  $E\{S|R_3\} \approx \sum_{v=1}^{k} \Phi[(\mu_v - D)/\sigma_v]$.


Let $\pi_1$ and $\pi_2$ be two normal populations with means 0 and $\theta$ ($\ge 0$)
respectively and a common unit variance. The asymptotic relative efficiency
of $R_1$ (which is equivalent to $R_2$ and $R_3$ in the case of two populations)
relative to the rule R based on sample means (see Section 2) is given by

(6.23)  $ARE(R_1, R; \theta) = \{ [2\Phi(2^{-1/2}\theta) - 1] / [2\theta B(\theta)] \}^2$

where

(6.24)  $B^2(\theta) = \int_{-\infty}^{\infty} \Phi^2(x+\theta) \, \phi(x) \, dx - \Phi^2(2^{-1/2}\theta)$.

We see that $\lim_{\theta \to 0} ARE(R_1, R; \theta) = 3/\pi$.
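The limiting value $3/\pi$ can be checked numerically from (6.23) and (6.24). The sketch below evaluates $B^2(\theta)$ with a midpoint rule on a truncated range and then the efficiency at a small $\theta$; the function names, the truncation limit, and the step count are choices made here for illustration only.

```python
import math


def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def b_squared(theta, lim=8.0, steps=4000):
    """B^2(theta) of (6.24), with the integral taken over [-lim, lim]."""
    dx = 2.0 * lim / steps
    integral = 0.0
    for i in range(steps):
        x = -lim + (i + 0.5) * dx
        integral += Phi(x + theta) ** 2 * phi(x) * dx
    return integral - Phi(theta / math.sqrt(2.0)) ** 2


def are_normal(theta):
    """ARE(R_1, R; theta) of (6.23), for theta > 0."""
    numerator = 2.0 * Phi(theta / math.sqrt(2.0)) - 1.0
    return (numerator / (2.0 * theta * math.sqrt(b_squared(theta)))) ** 2
```

At $\theta = 0$ the integral in (6.24) equals 1/3, so $B^2(0) = 1/3 - 1/4 = 1/12$, and `are_normal(0.01)` should agree with $3/\pi$ to several decimal places.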

In the case of two exponential distributions $F_{\theta_i}(x) = 1 - e^{-x/\theta_i}$ (x > 0),
where $\theta_1 = 1$ and $\theta_2 = \theta \ge 1$, a similar comparison of $R_2$ and the rule R'
of Gupta (1963) for gamma populations yields


(6. 25) 


ARE  (R2,R’;e)  =  [(e-])/4(e+i)  Bj(e)  log  9], 


where

(6.26)  $B_1^2(\theta) = 1 - 2(1+\theta)^{-1} + (2\theta+1)^{-1} + \theta(2+\theta)^{-1} - 2\theta^2 (1+\theta)^{-2}$.

In this case $\lim_{\theta \downarrow 1} ARE(R_2, R'; \theta) = 3/4$.
Some comparisons of the procedure $R_1$ and two other procedures
were made in the case of three independent exponential populations by McDonald
(1969a). Procedures similar to $R_1$, $R_2$ and $R_3$ were studied by McDonald (1969b)
by taking $T_i = \sum_{j=1}^{n} R_{ij}$, where $R_{ij}$ is the rank of $X_{ij}$ among
$X_{1j}, X_{2j}, \ldots, X_{kj}$. The results for the probability of a correct selection are
very similar to those discussed above. In another paper McDonald (1971) has
discussed some methods of approximating the constants required to implement the
procedures $R_1$ and $R_3$.


(d) Selection in terms of measures of association.

Let $F_i(x, y)$ denote the continuous distribution function of $\pi_i$
(i = 1,...,k), a set of k bivariate populations, and let $\tau_i$ denote the rank
correlation coefficient for population $\pi_i$. Let $(X_{i,j}, Y_{i,j})$, j = 1,...,n
and i = 1,...,k be n independent observations from each of these populations.
The rank $R_{ij}$ of $Y_{i,j}$ is the rank of its associated X value among
$X_{i,1}, \ldots, X_{i,n}$. The sample rank-correlation coefficient is given by

(6.27)  $T_i = \binom{n}{2}^{-1} \sum_{j < j'} \mathrm{sign}(R_{ij'} - R_{ij}), \quad i = 1, \ldots, k$.
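One way to read (6.27) is as follows: arrange the pairs from a population in increasing order of their Y values, replace each X by its rank among the X values, and average the signs of the rank differences over all pairs j < j'. The sketch below follows that reading, which is an interpretation adopted here rather than a statement taken from the cited paper; it assumes no ties, and the function name `rank_correlation` is hypothetical.

```python
from itertools import combinations


def rank_correlation(pairs):
    """Sample rank correlation in the spirit of (6.27) for a list of (x, y) pairs."""
    by_y = sorted(pairs, key=lambda p: p[1])      # order observations by y
    xs = sorted(p[0] for p in pairs)
    ranks = [xs.index(p[0]) + 1 for p in by_y]    # rank of the associated x
    n = len(ranks)
    total = sum((later > earlier) - (later < earlier)
                for earlier, later in combinations(ranks, 2))
    return total / (n * (n - 1) / 2)


t_concordant = rank_correlation([(1, 1), (2, 2), (3, 3)])
t_discordant = rank_correlation([(1, 3), (2, 2), (3, 1)])
```

Perfectly concordant pairs give the value 1 and perfectly discordant pairs give -1, with intermediate patterns falling in between.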


For selecting a subset containing the population with the largest $\tau$, Govindarajulu
and Gore (1971) proposed the following rule R.

R: Select $\pi_i$ iff

(6.28)  $T_i \ge \max_{1 \le j \le k} T_j - h$.

Using the normality of the $T_i$ and assuming a knowledge of the structure of
$X_i$ and $Y_i$ (which implies the same sign for the correlation between any two
X's) they have obtained a lower bound on $P\{CS|R\}$ which is used to obtain a
suitable value of h. In the absence of any information on the structure of $X_i$
and $Y_i$, an approximate value of h is found by using certain consistent
estimators of the mean and the variance of the asymptotic distribution of $T_i$.
For sufficiently small $\rho_i$ the asymptotic efficiency of the procedure R
relative to the procedure $R_1$ defined by (5.51) based on the product moment
correlation coefficient is found to be $9/\pi^2$ when the underlying populations
are bivariate normal. For the p-variate case (p > 2) some suitable measures
of association have been discussed by Govindarajulu and Gore.
7. Sequential Procedures

Barron and Gupta (1970) investigated a non-eliminating sequential rule
for selecting, from k independent normal populations with unknown means
$\theta_1, \ldots, \theta_k$ respectively and a common known variance $\sigma^2$, a subset containing
the population with the largest $\theta_i$. The rule is non-eliminating in the
sense that, though the rule selects and rejects populations at various stages,
observations are taken from all the populations until the final decision is
made. The ordered $\theta_i$ are denoted by $\theta_{[1]} \le \cdots \le \theta_{[k]}$ and it is assumed
that the successive differences between the ordered $\theta_i$ are known. To select
a subset containing the population with $\theta_{[k]}$, Procedure $\mathcal{J}$ investigated
by Barron and Gupta is described below.

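We take one observation from each population, denoted by $x_1, x_2, \ldots, x_k$.
For each population $\pi_i$ define

(7.1)  $Y_{i1} = \begin{cases} 1 & \text{if } x_i \ge x_{\max} - d\sigma \\ 0 & \text{otherwise,} \end{cases}$

where $x_{\max} = \max(x_1, \ldots, x_k)$ and d is determined by

(7.2)  $\int_{-\infty}^{\infty} \Phi^{k-1}(x+d) \, d\Phi(x) = P^*$.

Then we draw a second set of observations and define $Y_{i2}$ (i = 1,...,k) similar to
$Y_{i1}$. In general, when m sets of observations are drawn, we have $Y_{im}$, i = 1,...,k.
For each population $\pi_i$, we define

(7.3)  $S_{im} = \sum_{j=1}^{m} Y_{ij}$.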
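The constant d in (7.2) has no closed form for k > 2 but is easy to obtain numerically. The sketch below evaluates integrals of the form $\int \prod_j \Phi(x + d + \delta_j) \, d\Phi(x)$ by a midpoint rule and solves (7.2) by bisection; with all shifts $\delta_j$ equal to zero the integral is the left-hand side of (7.2). The function names, the integration range, and the bracketing interval [0, 20] for d are assumptions made here, and P* is assumed to lie strictly between 1/k and 1.

```python
import math


def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def success_prob(shifts, d, lim=8.0, steps=4000):
    """Midpoint-rule value of the integral of prod_j Phi(x + d + shift_j) dPhi(x)."""
    dx = 2.0 * lim / steps
    total = 0.0
    for i in range(steps):
        x = -lim + (i + 0.5) * dx
        prod = 1.0
        for shift in shifts:
            prod *= Phi(x + d + shift)
        total += prod * phi(x) * dx
    return total


def solve_d(p_star, k, tol=1e-8):
    """Bisection for d >= 0 in (7.2); assumes 1/k < p_star < 1."""
    lo, hi = 0.0, 20.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if success_prob([0.0] * (k - 1), mid) < p_star:
            lo = mid
        else:
            hi = mid
    return hi
```

For k = 2 the integral equals $\Phi(d/\sqrt{2})$, which provides a simple check on the numerical solution. Passing non-zero shifts to `success_prob` gives the probability that a single indicator equals 1 for a population whose mean exceeds the others by the corresponding amounts (in units of $\sigma$).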


We have a pair of sequences of real numbers $\eta = \eta_{b,c} = (\{b_m\}, \{c_m\})$ such
that for all $m \ge 1$,

(7.4)  (i) $b_m \le b_{m+1}$, $c_m \le c_{m+1}$,
       (ii) $b_m < c_m$,
       (iii) $\lim_{m \to \infty} b_m = \infty$,
       (iv) $P\{\bigcap_{m=1}^{\infty} [b_m < S_{im} < c_m]\} = 0$ for all i = 1,...,k.

The sequential selection procedure is now defined.

$\mathcal{J}$: Tag population $\pi_i$, i = 1,...,k, at the first stage $m \ge 1$ such that
$S_{im} \notin (b_m, c_m)$ and mark it "rejected" if $S_{im} \le b_m$ and "accepted" if
$S_{im} \ge c_m$. Continue sampling from all k populations until each has been
tagged; then accept those marked "accepted" and reject those marked "rejected".

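The following observations are made at the outset. For any
m, $P\{Y_{im} = 1\} = p_i$ and $P\{Y_{im} = 0\} = 1 - p_i$, where

(7.5)  $p_i = \int_{-\infty}^{\infty} \Big[ \prod_{j=1, j \ne i}^{k} \Phi(x + d + (\theta_{[i]} - \theta_{[j]})/\sigma) \Big] \, d\Phi(x), \quad i = 1, \ldots, k$.

Also $Y_{i1}, Y_{i2}, \ldots, Y_{im}$ are independent and $S_{im}$ is distributed as a binomial
random variable with parameters m and $p_i$. Let $\pi_{(r)}$ denote the population
with mean $\theta_{[r]}$. Define

$a_i(m) = a_i(m, \eta_{b,c}) = P\{\text{accepting } \pi_{(i)} \text{ at stage } m \mid \mathcal{J}(\eta_{b,c})\}$,

$r_i(m) = r_i(m, \eta_{b,c}) = P\{\text{rejecting } \pi_{(i)} \text{ at stage } m \mid \mathcal{J}(\eta_{b,c})\}$,

$a_i(\eta_{b,c}) = \sum_{m=1}^{\infty} a_i(m)$ and $r_i(\eta_{b,c}) = \sum_{m=1}^{\infty} r_i(m)$,

where $\mathcal{J}(\eta_{b,c})$ is the procedure using the pair of sequences $\eta_{b,c}$. When there
is no ambiguity, $\mathcal{J}(\eta)$ is used for $\mathcal{J}(\eta_{b,c})$.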

Definition 7.1. Let $\eta = (\{b_m\}, \{c_m\})$ and $\eta' = (\{b_m'\}, \{c_m'\})$ be two pairs of
sequences satisfying (7.4). The sequences $\{b_m\}$ and $\{b_m'\}$ are said to be
pairwise ordered iff $b_m \le b_m'$ for all $m \ge 1$. This relation is denoted by
$\{b_m\} \prec \{b_m'\}$.

Definition 7.2. The pair $\eta$ is ordered w.r.t. $\eta'$ (denoted by $\eta \prec \eta'$) iff
$\{b_m\} \prec \{b_m'\}$ and $\{c_m\} \prec \{c_m'\}$.

Definition 7.3. A class C of pairs of sequences satisfying (7.4) is said to be
ordered if for all $\eta, \eta' \in C$ either $\eta \prec \eta'$ or $\eta' \prec \eta$.

The following two theorems have been established by Barron and Gupta.

Theorem 7.1. If $\eta' \prec \eta$ then $a_i(\eta') \ge a_i(\eta)$ and $r_i(\eta') \le r_i(\eta)$,
i = 1,2,...,k. In particular $P\{CS|\mathcal{J}(\eta')\} \ge P\{CS|\mathcal{J}(\eta)\}$.

Theorem 7.2. The procedure $\mathcal{J}(\eta)$ is monotone and unbiased, i.e.,
$a_k \ge a_{k-1} \ge \cdots \ge a_1$ and $r_k \le r_i$, i = 1,2,...,k-1.

The rest of the investigation of the procedure $\mathcal{J}(\eta)$ has been accomplished
by using the following class $C_1$ of pairs of sequences. Let $b_m = \delta m - \gamma_1$, $c_m = \delta m + \gamma_2$
where $\delta$ is a rational number in (0,1) and $\gamma_1, \gamma_2$ are positive integers.
For $\gamma_1, \gamma_2$ fixed, the class $C_1$ is ordered in $\delta$. For this class it is shown
that condition (iv) of (7.4) holds. If we set $R_m = S_{im} - \delta m$, for any $\eta \in C_1$,
the events $[\delta m - \gamma_1 < S_{im} < \delta m + \gamma_2]$, $[S_{im} \ge \delta m + \gamma_2]$ and $[S_{im} \le \delta m - \gamma_1]$ are equivalent
to $[-\gamma_1 < R_m < \gamma_2]$, $[R_m \ge \gamma_2]$ and $[R_m \le -\gamma_1]$ respectively. By taking $\delta = t/s$
where t and s are relatively prime integers with t < s, the problem of
evaluating the various probabilities and expectations is reduced to a problem
concerning a random walk on the line where the state space is all points of the
form (Ns - Mt)/s for all integers $M \ge N \ge 0$. It is now possible to relate it to
a random walk on the space of integers. These probabilities and expectations
are not always easy to compute and hence some approximations and bounds were
obtained. We summarize the results below.

Theorem 7.3. For the sequential procedure $\mathcal{S}(\eta)$ where $\eta = (\{\delta m - \gamma\}, \{\delta m + \gamma\})$ and $\delta = t/s > 0$,

(7.6)  $\lim_{\gamma \to \infty} a_i(\delta, \gamma) = \begin{cases} 0 & \text{if } p_i < t/s, \\ 1/2 & \text{if } p_i = t/s, \\ 1 & \text{if } p_i > t/s, \end{cases}$

where $p_i$ is given by (7.5).

Theorem 7.4. Let $m_i$ be the smallest $m \ge 1$ such that $\pi_{(i)}$ is accepted or rejected and let $M_i = E\{m_i \mid \mathcal{S}(\eta)\}$. Then, for the sequential procedure $\mathcal{S}(\eta)$ specified in Theorem 7.3,

(7.7)  $M_i \approx \gamma / |p_i - t/s|$

provided $\gamma$ is sufficiently large and $p_i \ne t/s$.

Numerical evaluations made for $\delta = 0.75$, $\gamma = 3(1)10$ and $p_i = 0.4, 0.6, 0.8, 0.9$ indicate that the approximations are good for all the $\gamma$ values chosen. The approximation in the case of the probability of selecting the populations using the procedure improves as $\gamma$ increases.
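The limits in Theorem 7.3 and the approximation (7.7) can be explored with a small simulation. The sketch below is illustrative and not taken from the report: it assumes that $S_{im}$ is the number of successes in m independent Bernoulli($p_i$) trials (which is consistent with the state space $(Ns - Mt)/s$ described above), takes $\gamma_1 = \gamma_2 = \gamma$, and uses hypothetical function names. Positions are stored in integer units of 1/s so that boundary crossings are detected exactly.

```python
import random


def run_walk(p, t, s, gamma, rng):
    """Run one walk R_m = S_m - (t/s) m until it leaves (-gamma, gamma).

    Assumption (not stated in the text): S_m counts successes in m
    Bernoulli(p) trials.  The position is stored as s * R_m = N*s - M*t,
    an integer, to avoid floating-point drift.
    Returns (accepted, number_of_stages).
    """
    pos, m = 0, 0
    bound = gamma * s
    while -bound < pos < bound:
        m += 1
        pos += (s - t) if rng.random() < p else -t
    return pos >= bound, m


def estimate(p, t, s, gamma, reps=2000, seed=1):
    """Monte Carlo estimates of the acceptance probability and E(stages)."""
    rng = random.Random(seed)
    results = [run_walk(p, t, s, gamma, rng) for _ in range(reps)]
    accept_rate = sum(acc for acc, _ in results) / reps
    mean_stages = sum(m for _, m in results) / reps
    return accept_rate, mean_stages


if __name__ == "__main__":
    # delta = 3/4 and gamma = 10, as in the numerical evaluations quoted above.
    for p in (0.4, 0.6, 0.9):
        a_hat, m_hat = estimate(p, t=3, s=4, gamma=10)
        approx = 10 / abs(p - 0.75)  # right-hand side of (7.7)
        print(p, round(a_hat, 3), round(m_hat, 1), round(approx, 1))
```

Because the walk overshoots the boundary slightly when it stops, the simulated mean number of stages tends to sit a little above the value given by (7.7).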

There still remains the problem of choosing the two constants $\delta$ and $\gamma$. Theorem 7.3 guarantees that for any choice of $\delta \in (p_{k-1}, p_k)$ and any $\epsilon > 0$, there exists a $\gamma = \gamma(\delta, \epsilon)$ such that

(7.8)  (i) $a_k(\delta, \gamma) \ge 1 - \epsilon$ and
       (ii) $a_{k-1}(\delta, \gamma) \le \epsilon$

regardless of the configuration of $p_1 \le p_2 \le \cdots \le p_k$ and hence the configuration of $\theta_{[1]} \le \theta_{[2]} \le \cdots \le \theta_{[k]}$. Thus for a sufficiently small $\epsilon$, the P*-condition can always be satisfied by choosing an appropriate $\eta \in C_1$.
If we define S to be the size of the selected subset when the procedure terminates, then $E(S) = \sum_{i=1}^{k} a_i \le 1 + (k-1) a_{k-1}$. Then we can replace (7.8) by

(7.9)  (i) $a_k(\delta, \gamma) \ge 1 - \epsilon$ and
       (ii) $1 - \epsilon \le E(S) \le 1 + (k-1)\epsilon$

regardless of the configuration of the means $\theta_{[1]} \le \cdots \le \theta_{[k]}$. The experimenter has, for any $\delta \in (p_{k-1}, p_k)$, a countably infinite number of procedures $\eta$ which guarantee (7.9). Given two procedures $\eta, \eta' \in C_1$ which satisfy (7.9), the procedure with the smaller expected number of stages is preferable in some sense. If $M = \max_{1 \le i \le k} M_i$, then the experimenter will want to use a minimax rule, namely, an $\eta$ which minimizes M over the subclass $C_2 \subset C_1$ of procedures satisfying (7.9). The following theorem has been established using the approximate value of M.

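Theorem 7.5. For $\delta \in (p_{k-1}, p_k)$,

(7.10)  $\min_{\delta} M \approx \begin{cases} \min_{\delta^* \le \delta \le \bar\delta} \dfrac{\gamma_1(\delta)}{\delta - p_{k-1}} & \text{for } \delta^* < \bar\delta, \\[2ex] \min_{\bar\delta \le \delta \le \delta^*} \dfrac{\gamma_2(\delta)}{p_k - \delta} & \text{for } \bar\delta < \delta^*, \end{cases}$

where $\gamma_1(\delta)$ is the first positive integer such that $a_k \ge 1 - \epsilon$, $\gamma_2(\delta)$ is the first positive integer such that $a_{k-1} \le \epsilon$, $\delta^*$ is the value of $\delta$ such that $\gamma_1(\delta) = \gamma_2(\delta)$, and $\bar\delta = (p_k + p_{k-1})/2$.

A lemma shows that the approximate unique value $\delta^*$ is given by

(7.11)  $\delta^* = \begin{cases} \dfrac{\log[(1 - p_{k-1})/(1 - p_k)]}{\log[p_k (1 - p_{k-1}) / (p_{k-1}(1 - p_k))]} & \text{if } p_{k-1} + p_k \ne 1, \\[2ex] 1/2 & \text{if } p_{k-1} + p_k = 1. \end{cases}$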

However, there still remains the problem of choosing a specific $\delta$ if $\delta^* \ne \bar\delta$. It has been found empirically by Barron (1968) that often $\delta^* \approx \bar\delta$, so that the experimenter will not be "far" from the minimum for any choice of $\delta$ between $\bar\delta$ and $\delta^*$. Numerical evidence indicates that if $\bar\delta$ and $\delta^*$ are significantly apart, the minimum takes place near $\delta^*$. It seems that an approximate minimax rule which has certain desirable properties would be $\mathcal{S}(\eta^*)$ where $\eta^* = (\{\delta^* m - \gamma^*\}, \{\delta^* m + \gamma^*\})$.
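To see how close $\delta^*$ and $\bar\delta$ typically are, formula (7.11) can be evaluated directly. The following is a minimal sketch with hypothetical function names, assuming the reading of (7.11) as a ratio of logarithms with $0 < p_{k-1} < p_k < 1$.

```python
from math import log


def delta_star(p_km1, p_k):
    """Approximate delta* from (7.11) for 0 < p_{k-1} < p_k < 1.

    When p_{k-1} + p_k = 1 the ratio below reduces algebraically to 1/2,
    so no separate branch is needed for that case.
    """
    if not 0.0 < p_km1 < p_k < 1.0:
        raise ValueError("need 0 < p_{k-1} < p_k < 1")
    num = log((1.0 - p_km1) / (1.0 - p_k))
    den = log(p_k * (1.0 - p_km1) / (p_km1 * (1.0 - p_k)))
    return num / den


def delta_bar(p_km1, p_k):
    """Midpoint of the interval (p_{k-1}, p_k)."""
    return 0.5 * (p_km1 + p_k)


if __name__ == "__main__":
    for pair in ((0.4, 0.6), (0.6, 0.8), (0.5, 0.9)):
        print(pair, round(delta_star(*pair), 4), delta_bar(*pair))
```

For the pairs tried here the two values differ only in the second decimal place, in line with the empirical remark attributed to Barron (1968).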

Some sample size comparisons have been made numerically between the procedure $\mathcal{S}(\eta^*)$ and the fixed sample-size procedure of Gupta (1965) based on means of samples of size n from the k populations, which is denoted here by R(n) and defined below.

R(n): Select $\pi_i$ iff $\bar{x}_i \ge \max_{1 \le j \le k} \bar{x}_j - d\sigma/\sqrt{n}$,

where d is given by (7.2).

The comparison was made with $\sigma = 1$ under the slippage configuration $\theta_{[1]} = \cdots = \theta_{[k-1]} = \theta$, $\theta_{[k]} = \theta + \tau$, $\tau > 0$, and the equally-spaced configuration $\theta_{[1]} = \theta$, $\theta_{[2]} = \theta + \tau, \ldots, \theta_{[k]} = \theta + (k-1)\tau$, $\tau > 0$. The following ranges of the values of k, $\tau$ and P* were considered:

(i) Slippage configuration: k = 2(1)10, 25, 50; $\tau$ = 0.05, 0.10(0.10)0.60, 1, 2; P* = 0.75, 0.90.

(ii) Equally-spaced configuration: k = 2(1)[upper limit illegible in source]; $\tau$ = 0.05, 0.10(0.10)0.60; P* = 0.75, 0.90.

The empirical results indicate that $\mathcal{S}(\eta^*)$ is preferable when the means are close and R(n) is better when any one mean gets significantly larger than the others.

Guttman (1963) considers a sequential procedure for a goal which is different from the usual one. Suppose that $\pi_i$ (i = 1, ..., k) has the density $f_{\theta_i}(x)$ and the quality of the population is characterized by $h_i = g(\theta_i)$ where g is a known function. Let T be an appropriate statistic based on a sample of n independent observations in the sense that E(T) is $g(\theta)$ or a monotonic function of $g(\theta)$. Consider the rule R which selects $\pi_i$ iff

(7.12)  $T_i \in \omega_{n,k}(P^*, \mathbf{T})$

where $\omega_{n,k}(P^*, \mathbf{T})$ is a random linear set contained in the sample space of $T_i$, depends on $\mathbf{T} = (T_1, \ldots, T_k)$, and is such that $\inf P(CS \mid R) = P^*$.

Since the size of the selected subset is random, a natural question is how to proceed sequentially so that we could select one population as the best or reduce the size of the subset selected, subject to certain cost considerations which restrict the number of stages.

Let t denote the stage of the experiment and $k_t$ denote the number of populations retained at the start of the stage. If M units of capital are available to spend on the procedure and at each stage a sample of $n_t$ independent observations is taken from each population, let $t_0$ be the largest integer for which $\sum_{i=1}^{t_0} k_i n_i d \le M$, where d is the cost per observation.

The sequential procedure proposed and investigated by Guttman (1963) is defined below.

R': At each stage t, use the rule R with $P^* = P^*_t$ where $P^*_t = 1 - (1 - \beta)/2^t$, adopting the following stopping rule:

At the end of stage t,

(1) Stop if $t = t_0$.

(2) Stop if $t < t_0$ and $k_{t+1} = 1$.

(3) Continue if $t < t_0$ and $k_{t+1} > 1$.

It has been shown that $P\{CS \mid R'\} \ge \beta$. Suppose that there is infinite capital. We say that the rule R' is in state $\gamma$ if, at any stage t, we have $k_t = \gamma$. The states form a Markov chain with non-stationary transition probabilities

(7.13)  $p_{\gamma\alpha}(t) = P\{k_{t+1} = \alpha \mid k_t = \gamma\}$, $1 \le \alpha, \gamma \le k$.

These are dependent on $\omega_{n_t, \gamma}(P^*_t, \mathbf{T})$. We note that $p_{\gamma\alpha} = 0$ if $\gamma < \alpha$ and $p_{11} = 1$. The following theorem has been established by Guttman (1963).

Theorem 7.6. Consider the Markov chain with the above structure. Let $p_{\alpha\alpha}(t) = 1 - \beta_\alpha(t)$, $0 < \beta_\alpha(t) < 1$ for $\alpha \ne 1$. Then the Markov chain is absorbed at state 1 (i.e., R' terminates at a finite stage) iff $\sum_{t=1}^{\infty} \beta_\alpha(t)$ diverges for all $\alpha \ne 1$.
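The divergence condition in Theorem 7.6 can be illustrated for a single non-absorbing state. The probability of never leaving that state through stage T is the product of the stay probabilities $1 - \beta(t)$, and this product tends to zero exactly when the sum of the $\beta(t)$ diverges. The sketch below uses two hypothetical choices of $\beta(t)$ that are not from the report.

```python
def stay_probability(beta, horizon):
    """Probability of remaining in a non-absorbing state for `horizon` stages.

    `beta(t)` is the probability of leaving the state at stage t, so the
    chance of never leaving through stage `horizon` is prod_t (1 - beta(t)).
    """
    prob = 1.0
    for t in range(1, horizon + 1):
        prob *= 1.0 - beta(t)
    return prob


def divergent(t):
    """beta(t) = 1/(t+1): the series sum_t beta(t) diverges."""
    return 1.0 / (t + 1)


def convergent(t):
    """beta(t) = 1/(t+1)^2: the series sum_t beta(t) converges."""
    return 1.0 / (t + 1) ** 2


if __name__ == "__main__":
    for horizon in (10, 100, 1000):
        print(horizon,
              stay_probability(divergent, horizon),
              stay_probability(convergent, horizon))
```

In the first case the product telescopes to 1/(T+1) and vanishes, so the state is eventually left with probability one. In the second it telescopes to (T+2)/(2(T+1)), which stays near 1/2, so the chain has a positive chance of never being absorbed.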

It might be possible to find a "reasonable" value of $n_t$ in some special cases. Suppose that the expected subset size E(S) at stage t can be written as a function of $n_t$, $k_t$, $P^*_t$ and the differences among the $h_{[i]}$. Since $k_t$ and $P^*_t$ are known, if we have information about the differences of the $h_{[i]}$, we can set E(S) = 1 and solve for $n_t$.

8. Selection from Restricted Families of Distributions.

There are situations where we do not know the actual functional forms of the distributions $F_i$, $i = 1, \ldots, k$, associated with the populations but have some information about the class of functions to which they belong, defined in terms of a partial order relation with respect to a known distribution G.

Such families do occur in practical problems. In these cases the evaluation of the necessary constants for the procedures depends on the knowledge of G but not on the forms of the $F_i$ themselves, and in this restricted sense the procedures are distribution-free. Barlow and Gupta (1969) have discussed selection procedures for restricted families of distributions mainly in terms of their quantiles. We will briefly discuss here these procedures and indicate certain other related problems.

Assume that each $F_i$ has a unique $\alpha$-quantile, $\xi_{i\alpha}$. Let $F_{[i]}$ denote the cumulative distribution function (cdf) of the population with the ith smallest $\alpha$-quantile. We assume that

(8.1)  (a) $F_{[i]}(x) \ge F_{[k]}(x)$, $i = 1, 2, \ldots, k$ and all x,
       (b) there exists a continuous distribution G such that $F_{[i]} \prec G$ for all $i = 1, \ldots, k$,

where $\prec$ denotes a partial ordering relation on the space of distributions. To be precise, $F \prec F$ for all F, and $F \prec G$, $G \prec H \Rightarrow F \prec H$. Note that $F \prec G$ and $G \prec F$ do not necessarily imply $F \equiv G$.

Some special cases of partial ordering which are of interest here are:

(i) $F \prec_* G$ iff $F(0) = G(0) = 0$ and $G^{-1}F(x)/x$ is nondecreasing in $x \ge 0$ on the support of F.

(ii) $F \prec_c G$ iff $G^{-1}F(x)$ is convex on the support of F.

(iii) $F \prec_r G$ iff $F(0) = G(0) = 1/2$ and $G^{-1}F(x)/x$ is increasing (decreasing) for x positive (negative) on the support of F.

If $G(x) = 1 - e^{-x}$, $x \ge 0$, then (i) defines the class of IFRA distributions studied by Birnbaum, Esary and Marshall (1966) while (ii) defines the class of IFR distributions studied by Barlow, Marshall and Proschan (1963). It is easy to see that $\prec_c$ ordering implies $\prec_*$ ordering. Implications of $\prec_r$ ordering have been studied by Lawrence (1966). Van Zwet (1964) investigated the convex ordering and s-ordering (not defined above).
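For the exponential reference distribution $G(x) = 1 - e^{-x}$, the transform $G^{-1}F(x)$ is the cumulative hazard $-\log(1 - F(x))$, so orderings (i) and (ii) can be checked numerically on a grid. The sketch below is illustrative only: the function names, the grid, and the choice of Weibull distributions as test cases are assumptions and do not come from the report.

```python
from math import exp, log


def cum_hazard(cdf):
    """Return x -> G^{-1}(F(x)) for G(x) = 1 - exp(-x), i.e. -log(1 - F(x))."""
    return lambda x: -log(1.0 - cdf(x))


def is_star_ordered(cdf, grid):
    """Check numerically that G^{-1}F(x)/x is nondecreasing on `grid` (x > 0)."""
    phi = cum_hazard(cdf)
    ratios = [phi(x) / x for x in grid]
    return all(b >= a - 1e-9 for a, b in zip(ratios, ratios[1:]))


def is_convex_ordered(cdf, grid):
    """Check that G^{-1}F has nonnegative second differences on an
    equally spaced `grid` (a discrete stand-in for convexity)."""
    phi = cum_hazard(cdf)
    vals = [phi(x) for x in grid]
    return all(c - 2.0 * b + a >= -1e-9
               for a, b, c in zip(vals, vals[1:], vals[2:]))


def weibull_cdf(shape):
    """Weibull cdf with unit scale: F(x) = 1 - exp(-x**shape), x >= 0."""
    return lambda x: 1.0 - exp(-(x ** shape))


grid = [0.1 * i for i in range(1, 31)]  # 0.1, 0.2, ..., 3.0

if __name__ == "__main__":
    for shape in (0.5, 1.0, 2.0):
        F = weibull_cdf(shape)
        print(shape, is_star_ordered(F, grid), is_convex_ordered(F, grid))
```

A Weibull distribution with shape 2 has cumulative hazard $x^2$, which is convex and star-shaped, so it passes both checks. With shape 0.5 the cumulative hazard is $\sqrt{x}$ and both checks fail.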

(a) Quantile selection rules for distributions $\prec_*$ ordered w.r.t. G.

The distributions $F_{[i]}$ and G satisfy the assumptions in (8.1). Let $T_{j,i}$ denote the jth order statistic based on n independent observations from $F_i$, where $j \le (n+1)\alpha < j+1$. Then for selecting the population with the largest $\alpha$-quantile, Barlow and Gupta (1969) proposed the rule

R: Select $\pi_i$ iff

(8.2)  $T_{j,i} \ge c \max_{1 \le r \le k} T_{j,r}$,

where $0 < c = c(k, P^*, n, j) < 1$ is determined so as to satisfy the P*-condition. It has been shown by Barlow and Gupta that

(8.3)  $\inf_{\Omega} P(CS \mid R) = \int_0^{\infty} [G_j(x/c)]^{k-1}\, dG_j(x)$,

where $\Omega$ is the space of all the k-tuples $(F_1, \ldots, F_k)$ and $G_j(x)$ is the cdf of the jth order statistic based on n independent observations from G. Thus the constant c of the procedure is determined by

(8.4)  $\int_0^{\infty} [G_j(x/c)]^{k-1}\, dG_j(x) = P^*$

and is tabulated by Barlow, Gupta and Panchapakesan (1969) in the case of $G(x) = 1 - e^{-x}$, $x \ge 0$, for selected values of n, k, j and P*. For j = 1, the constant c is easily seen to be independent of n.
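The independence from n in the case j = 1 can be made concrete. When $G(x) = 1 - e^{-x}$, the smallest of n observations is exponential with rate n, so the substitution u = nx removes n from (8.4) and a binomial expansion gives a closed form for the integral, namely $\sum_{m=0}^{k-1} (-1)^m \binom{k-1}{m} / (1 + m/c)$. The sketch below, with hypothetical function names, solves for c by bisection under exactly these assumptions (j = 1, exponential G). It is not the method used for the published tables.

```python
from math import comb


def inf_pcs(c, k):
    """Left side of (8.4) for j = 1 and G(x) = 1 - exp(-x).

    Uses int_0^inf (1 - e^{-u/c})^{k-1} e^{-u} du expanded binomially;
    each term integrates to 1 / (1 + m/c).
    """
    return sum((-1) ** m * comb(k - 1, m) / (1.0 + m / c) for m in range(k))


def solve_c(k, p_star, tol=1e-12):
    """Find c in (0, 1) with inf_pcs(c, k) = p_star by bisection.

    inf_pcs decreases from 1 (as c -> 0) to 1/k (at c = 1), so a root
    exists only for 1/k < p_star < 1.
    """
    if not 1.0 / k < p_star < 1.0:
        raise ValueError("need 1/k < P* < 1")
    lo, hi = 1e-12, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if inf_pcs(mid, k) > p_star:
            lo = mid  # probability still too high: c can be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for k in (2, 5, 10):
        print(k, round(solve_c(k, 0.75), 4), round(solve_c(k, 0.90), 4))
```

For k = 2 the integral equals 1/(1 + c), so c = 1/P* - 1, which provides a quick check of the numerical solution.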

We discussed earlier in Section 6 a non-parametric procedure $R_1$ studied by Rizvi and Sobel (1967) for the quantile selection problem. It has been shown by Barlow and Gupta that the rules $R$ and $R_1$ are asymptotically equally efficient in the sense defined by (6.4) under the scale slippage configuration.

A selection rule $R'$ proposed by Gupta (1963) for gamma populations based on the sample means has been referred to in Section 2. Comparing $R$ and $R'$

under  the  slippage  configuration  »  SX^,  0  <  6  <  l,  i  -  l,...,k-l, 

we  have 

7  -2  2  7  7 

(8.5)  A(R,R';  6)  >2(1-6)*  a  [-log  a|  /[r  (log  4)*  aa  (1  ♦  6*)]  , 
where $\bar\alpha = 1 - \alpha$. Consequently we obtain

(8.6)  $A(R, R'; \delta \uparrow 1) \ge 0.493$ for $\alpha = 1/2$.


Barlow and Gupta (1969) also considered selection in terms of the median when the distributions $F_i$ (i = 1, ..., k) have lighter tails than G, which means that $F_i$, centered at its median $\Delta_i$, is $\prec_r$ ordered w.r.t. G (with $G(0) = 1/2$) and $(d/dx) F_i(x + \Delta_i)|_{x=0} \ge (d/dx) G(x)|_{x=0}$. In order to select the population with the largest median, the following rule $R_2$ was proposed.

$R_2$: Select $\pi_i$ iff

(8.7)  $T_{j,i} \ge \max_{1 \le r \le k} T_{j,r} - D$, $j \le (n+1)/2 < j+1$.

It was shown that the constant D > 0 satisfying the P*-condition is determined by

(8.8)  $\int_{-\infty}^{\infty} G_j^{k-1}(t + D)\, dG_j(t) = P^*$

where $G_j$ is as defined in (8.3).

It is easy to show that, if F has a lighter tail than G, then $G^{-1}F(x) - x$ is increasing in x, which means that F is tail-ordered w.r.t. G ($F \prec_t G$) according to a definition of Doksum (1969). As a matter of fact the rule defined by (8.7) can be used for the larger class of distributions $F_i$ which are tail-ordered w.r.t. G.


(b) Selection w.r.t. the means for IFR distributions.

Let $\mu_i$ be the mean of the distribution $F_i$, $i = 1, \ldots, k$, and let $F_{[i]}$ denote the distribution with the ith smallest mean. We assume that

(a) $F_{[i]}(x) \ge F_{[k]}(x)$ for $i = 1, \ldots, k-1$ and all x;

(b) $F_{[i]} \prec_c G$ for $i = 1, \ldots, k$,

where $G(x) = 1 - e^{-x}$, $x \ge 0$. We also assume that $F_i(0) = 0$ for all i.

Let $\bar{x}_i$ be the sample mean based on n independent observations from $\pi_i$ and $H_i(x)$ be the cdf of $\bar{x}_i$. Let $H_{[i]}$ denote the distribution of the sample mean from $F_{[i]}$. Then

(8.9)  $H_{[i]}(x) \ge H_{[k]}(x)$ for $i = 1, \ldots, k-1$ and all x

and

(8.10)  $H_{[i]} \prec_c G$ for $i = 1, \ldots, k$.

The statement in (8.9) is an immediate consequence of the assumption (a) above, while (8.10) follows from (b) and the closure of IFR distributions under convolutions (see Barlow, Marshall and Proschan (1963)). For selecting a subset containing the population associated with $F_{[k]}$, Barlow and Gupta (1969) proposed the rule $R_3$, namely,

$R_3$: Select the population $\pi_i$ iff

(8.11)  $\bar{x}_i \ge c' \max_{1 \le j \le k} \bar{x}_j$,

where the constant c' (0 < c' < 1) satisfying the P*-condition is given by

(8.12)  $\int_0^{\infty} [G(x/c')]^{k-1}\, dG(x) = P^*$.

The disadvantage of the rule $R_3$ is that the constant c' obtained from (8.12) is independent of n. However, by restricting the class of distributions to the gamma family we can obtain a lower bound for $P\{CS \mid R_3\}$ which depends on n.
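Once the constant has been fixed, applying a rule of the form (8.11) is a single pass over the sample means. The sketch below uses a hypothetical function name and made-up data. The value of c' is simply passed in, since its determination from (8.12) is a separate step.

```python
def select_r3(sample_means, c_prime):
    """Indices i with mean_i >= c' * max_j mean_j, as in rule (8.11).

    Assumes nonnegative sample means (the text takes F_i(0) = 0), so the
    population with the largest mean is always selected and the subset
    is never empty.
    """
    if not 0.0 < c_prime < 1.0:
        raise ValueError("c' must lie in (0, 1)")
    threshold = c_prime * max(sample_means)
    return [i for i, mean in enumerate(sample_means) if mean >= threshold]


if __name__ == "__main__":
    # Hypothetical sample means for k = 3 populations.
    print(select_r3([2.0, 5.0, 1.0], 1.0 / 3.0))
```

A smaller c' lowers the threshold and enlarges the selected subset, which is how the rule trades subset size against the probability of a correct selection.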

(c) Some results relating to partial orderings of distributions.

The two procedures $R$ and $R_2$ defined by (8.2) and (8.7) for the two types of ordering provide the motivation for an attempt by Panchapakesan (1969) to unify these two by a general order relation which throws more light on a lemma of Gupta (1966b). We define the general ordering here in a slightly revised form.

Definition 8.1. Let $\mathcal{H} = \{h(x)\}$ be a class of real-valued functions on the real line. Then F is said to be $\mathcal{H}$-ordered w.r.t. G if $F(0) = G(0)$ and $G^{-1}F(h(x)) \ge h(G^{-1}F(x))$ for all $h \in \mathcal{H}$.

We note that if $\mathcal{H} = \{ax, a \ge 1\}$ and $F(0) = G(0) = 0$, then we get star-ordering. If $\mathcal{H} = \{x + b, b \ge 0\}$ and $F(0) = G(0) = 1/2$, then $\mathcal{H}$-ordering reduces to tail ordering. It has been shown that $\mathcal{H}$-ordering is a partial ordering and that order statistics preserve the ordering. The following lemma is the key result we need to bound below the probability of a correct selection.

Lemma 8.1. If $F \prec_{\mathcal{H}} G$, then, for any positive integer t,

(8.13)  $\int F^t(h(x))\, dF(x) \ge \int G^t(h(x))\, dG(x)$

for all $h \in \mathcal{H}$.

Gupta (1966b) proved the following lemma.

Lemma 8.2. X is a random variable having the distribution function $F_\lambda(x)$. Let $h_b(x)$ be a class of functions and suppose there exists a distribution function F(x) such that $h_b(g_\lambda(x)) \ge g_\lambda(h_b(x))$ for all $\lambda$ and all x, where $g_\lambda(x)$ is defined by $F_\lambda(g_\lambda(x)) = F(x)$ for all x. Then for any t > 0,

(8.14)  $\int F_\lambda^t(h_b(x))\, dF_\lambda(x) \ge \int F^t(h_b(x))\, dF(x)$.

It is shown that the assumption of Lemma 8.2 amounts to saying $F_\lambda \prec_{\mathcal{H}} F$. A general selection problem discussed by Panchapakesan (1969) is as follows. Let $\pi_1, \ldots, \pi_k$ be k populations and let $F_i$ be the distribution function associated with $\pi_i$. We assume that there exists one among the k populations which is stochastically larger than any other. Let us denote the distribution of that population by $F_{[k]}$. Thus we have

(8.15)  $F_i(x) \ge F_{[k]}(x)$ for $i = 1, \ldots, k$ and all x.

It is also assumed that there exists a continuous distribution G and a class of real-valued functions $\mathcal{H} = \{h(x)\}$ such that

(8.16)  $F_i \prec_{\mathcal{H}} G$ for $i = 1, 2, \ldots, k$.

If $X_i = (X_{i1}, X_{i2}, \ldots, X_{in})$ is the observed sample from $\pi_i$, then we confine ourselves to the class of statistics $T_i = T(X_i)$ that preserve both the ordering relations (8.15) and (8.16). Let $F_{T_i}$ represent the cdf of $T(X_i)$ under $F_i$, and $G_T$ the cdf of T(Y) under G, where $Y = (Y_1, \ldots, Y_n)$ is a random sample from G. If $h(x) \ge x$, then for selecting a subset containing the population associated with $F_{[k]}$, the following rule $R_4$ was proposed.

$R_4$: Select $\pi_i$ iff

(8.17)  $h(T_i) \ge \max(T_1, \ldots, T_k)$.

It has been shown that

(8.18)  $\inf_{\Omega} P\{CS \mid R_4\} = \int_{-\infty}^{\infty} G_T^{k-1}(h(x))\, dG_T(x)$.

If h(x) is indexed by the constants c and d ($c \ge 1$, $d \ge 0$) then we can find suitable constants c and d if the conditions on h(x) given in the very beginning of Section 3 are satisfied.

9. Bayes and Empirical Bayes Procedures.

Let $y = (y_1, \ldots, y_k) \in E^k$ (Euclidean k-space) be an observation of the random vector $Y = (Y_1, \ldots, Y_k)$ whose components are independent random variables, $Y_i$ having the density $f(y_i \mid \theta_i)$. The space of actions is denoted by $\mathcal{A}$ and it consists of all non-empty subsets of the k populations ($Y_i$ is the random variable associated with the population $\pi_i$, $i = 1, \ldots, k$). A selection procedure D is a mapping from $E^k$ to $\mathcal{A}$. The loss incurred when $\theta = (\theta_1, \ldots, \theta_k)$ is the true state of nature and D(y) is the subset selected is denoted by $L(D(y), \theta)$. Let $G_i$ be the a priori distribution of $\theta_i$, and let $G = \prod_{i=1}^{k} G_i$ denote the a priori distribution on the parameter space $\Omega$. The Bayes risk of a decision procedure D w.r.t. the a priori distribution G is defined by

(9.1)  $R(D, G) = \int_{\Omega} \left\{ \int_{E^k} L(D(y), \theta)\, f(y \mid \theta)\, dy \right\} dG(\theta)$,

where

$f(y \mid \theta) = \prod_{i=1}^{k} f(y_i \mid \theta_i)$.

A Bayes procedure w.r.t. G is a procedure D* for which the Bayes risk is minimum. Suppose we consider the loss function in selecting the subset $S_j$ given by

(9.2)  $L(S_j, \theta) = \sum_{q \in S_j} \alpha_{jq} (\theta_{[k]} - \theta_q)$,

where $\alpha_{jq} > 0$ and the summation is over all populations $\pi_q$ included in $S_j$. Deely and Gupta (1968) investigated Bayes procedures with the above formulation.


Before stating the main results of their investigation, we adopt the following notation for the sequel. $S_j$ denotes the singleton consisting of $\pi_j$, $j = 1, \ldots, k$. The remaining $2^k - k - 1$ subsets containing two or more populations will be denoted by $S_j$, $j = k+1, \ldots, 2^k - 1$, with no explicit ordering. Further let

(9.3)  $\varphi(S_j, y) = \int_{\Omega} L(S_j, \theta)\, f(y \mid \theta)\, dG(\theta)$, $j = 1, 2, \ldots, 2^k - 1$,

       $a_q = \int_{\Omega} (\theta_{[k]} - \theta_q)\, dG(\theta)$, $a_{[1]} = \min_{1 \le q \le k} a_q$.

Deely and Gupta have established the following result.

Theorem 9.1. Let the loss function be given by (9.2) in which $\alpha_{jj} = \alpha > 0$ for $j = 1, \ldots, k$. If $\sum_{q \in S_j} \alpha_{jq} \ge \alpha$ for every $j = 1, 2, \ldots, 2^k - 1$, then the Bayes procedure w.r.t. G for selecting a subset containing the population with $\theta_{[k]}$ is given by $D^* = D^*(y) = S_j$ where j is any positive integer 1, 2, ..., k such that

(9.4)  $\varphi(S_j, y) = \min_{1 \le i \le k} \varphi(S_i, y)$.

This result is applied to the normal means problem with $G_i$ as (i) normal with mean $\lambda_i$ and variance $\beta_i^2$ and (ii) uniform on $(\lambda_i - d_i, \lambda_i + d_i)$. In the first case, the Bayes procedure is:

Select $\pi_i$ for which

(9.5)  $\dfrac{n \beta_i^2 \bar{x}_i + \lambda_i}{1 + n \beta_i^2} = \max_{1 \le j \le k} \dfrac{n \beta_j^2 \bar{x}_j + \lambda_j}{1 + n \beta_j^2}$,

where the $\bar{x}_i$'s are sample means based on n observations.
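The quantity maximized in (9.5) is the usual posterior mean of a normal mean under a normal prior when the observations have unit variance. The following sketch, with hypothetical function names and made-up numbers, computes these posterior means and picks the population with the largest one. The unit-variance assumption is an inference from the form of (9.5) and is not stated explicitly at that point in the text.

```python
def posterior_means(xbar, lam, beta_sq, n):
    """Posterior means (n*b*x + l) / (1 + n*b) for each population.

    xbar:    sample means, one per population
    lam:     prior means lambda_i
    beta_sq: prior variances beta_i**2
    n:       common sample size (observations assumed to have unit variance)
    """
    return [(n * b * x + l) / (1.0 + n * b)
            for x, l, b in zip(xbar, lam, beta_sq)]


def bayes_select(xbar, lam, beta_sq, n):
    """Index of the population maximizing the posterior mean, as in (9.5)."""
    scores = posterior_means(xbar, lam, beta_sq, n)
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Equal vague priors: the larger sample mean wins.
    print(bayes_select([1.0, 2.0], [0.0, 0.0], [1.0, 1.0], n=4))
    # A tight prior centered at 5 outweighs a slightly larger sample mean.
    print(bayes_select([1.0, 1.2], [5.0, 0.0], [0.1, 10.0], n=1))
```

The second call shows how an informative prior can reverse the ranking given by the sample means alone.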


Some other cases like selection for binomial and Poisson populations where the parameters, respectively, have beta and gamma a priori distributions have been discussed by Deely (1965), who has also investigated empirical Bayes procedures for the selection problem which we presently discuss.

In the empirical Bayes approach, only the existence of an a priori distribution G on the parameter space is assumed and not a particular G. Thus the Bayes procedure is not available. Suppose independent observations $(x^*_1, \theta_1), (x^*_2, \theta_2), \ldots, (x^*_n, \theta_n)$ on a random variable X are available with the $\theta_i$'s all being drawn from the same distribution G. (The * indicates that "r" observations from each population have been taken for $i = 1, \ldots, n$.) The "prior observations" contain information about G, and thus if a decision procedure $D_n$ based upon $x^*_1, \ldots, x^*_n$ could be found such that $R(D_n, G)$ converges to $R(D_G, G)$ (i.e. the Bayes risk of $D_n$ converges to the Bayes risk of the Bayes procedure $D_G$ which we would use if we knew G at the start) for any G in some family $\mathcal{G}$, then the procedure $D_n$ is asymptotically optimal w.r.t. $\mathcal{G}$ and is called an empirical Bayes procedure w.r.t. the unknown G. The main theorem of Deely (1965) proves that under certain regularity conditions the Bayes procedure w.r.t. an estimate $G_n$ of G is also empirical Bayes w.r.t. G. In order to apply this theorem, a suitable estimate $G_n$ is required. A completely satisfactory answer to this problem is not available.

Suppose we make an additional assumption that G belongs to a parametric family $\mathcal{G}$ with parameter $\lambda = (\lambda_1, \ldots, \lambda_k)$. Suppose now that an estimate $\hat\lambda_{jn}$ of $\lambda_j$, depending on the prior observations from the jth population, can be found such that $\hat\lambda_{jn}$ converges to $\lambda_j$ with probability one. Then it is shown that $G_n = \prod_{j=1}^{k} G_{jn}$ converges to $G = \prod_{j=1}^{k} G_j$ with probability one. Further, $G_n$ is also a member of $\mathcal{G}$. Thus, if the Bayes procedure w.r.t. any G in $\mathcal{G}$ is available, then in particular $D_{G_n}$ is available and thus an empirical Bayes procedure w.r.t. G is obtained. Empirical Bayes procedures have been obtained for several special cases of $f(x \mid \theta)$ and G, namely, (i) normal-normal, (ii) normal-uniform, (iii) binomial-beta, (iv) Poisson-gamma. To illustrate the type of results obtained, we consider the case of normal-normal.

Let $\pi_i$ (i = 1, ..., k) have the normal density $f(x \mid \theta_i)$ with unknown mean $\theta_i$ and known variance $\sigma_i^2$, and let $\theta_i$ be distributed normally with unknown but finite mean $\lambda_i$ and known variance $\beta_i^2$. Let $x^*_1, x^*_2, \ldots, x^*_n$ be independent prior observations and $x^*$ the present observation. Then the empirical Bayes procedure $D_{G_n}(x^*)$ under the linear loss function in (9.2) with $\alpha_{jq} = 1$ selects the population $\pi_i$ for which

(9.6)  $Z_i = \max_{1 \le j \le k} Z_j$

where 


(9.7)

$\bar{x}_j$ denotes the sample mean from $\pi_j$ based on the present observation and $\bar{X}_j$ is the over-all mean of the prior observations from $\pi_j$.

Similar procedures have been obtained for the case where G is subject to certain very general conditions. We briefly describe one of the results below for the sake of illustration.

Suppose $f(x \mid \theta_j)$ is a normal density with mean $\theta_j$ and variance $\sigma_j^2$. Let $\theta_j$ be distributed according to $G_j$ such that $\int \theta\, dG_j(\theta) < \infty$, $j = 1, \ldots, k$. Let $x^*_1, x^*_2, \ldots, x^*_n$ be independent prior observations and $x^*$ be the present observations. We denote the mean of the present observations from $\pi_j$ by $\bar{x}_j$ and the means of the prior observations from $\pi_j$ by $\bar{x}_{aj}$, $a = 1, \ldots, n$. Let $H_{nj}(x)$ denote $(n+1)^{-1}$ times the total number of $\bar{x}_{aj}$'s which are $\le x$, including the present observation $\bar{x}_j$. Define

(9.8)  $h_{nj}(x) = \dfrac{H_{nj}(x + n^{-1/5}) - H_{nj}(x - n^{-1/5})}{2 n^{-1/5}}$, $j = 1, \ldots, k$,

(9.9)  $g_{nj}(x) = \dfrac{h_{nj}(x + n^{-1/5}) - h_{nj}(x - n^{-1/5})}{2 n^{-1/5}}$.

Then the empirical Bayes procedure under the linear loss function (9.2) (with $\alpha_{jq} = 1$) for selecting the best population is the procedure which selects the population $\pi_j$ (j = 1, ..., k) for which $\bar{x}_j + \dfrac{\sigma_j^2}{r} \dfrac{g_{nj}(\bar{x}_j)}{h_{nj}(\bar{x}_j)}$ is maximum. The main result used in these cases is a result due to Robbins (1964).

10. Modified Formulations and Goals

In the preceding sections we discussed the general theory of subset selection problems under the usual formulation and described several cases of specific distributions and ranking criteria used. There are, however, a few other cases which were not mentioned earlier. Barr and Rizvi (1966) considered the problem of selecting a subset containing the population with the largest $\theta$ from a set of k populations having uniform distributions over $(0, \theta_i)$, $i = 1, \ldots, k$. Guttman (1961) investigated selection problems using the coverage probability as the criterion of ranking. If $\pi_i$ (i = 1, ..., k) is described by the sample space $(\mathcal{X}, \mathcal{G}, P_{\theta_i})$ where $P_{\theta_i}$ is a probability measure belonging to the class $\{P_\theta\}$, $\theta \in \Theta$, the populations are ranked according to $b_i = \int_A dP_{\theta_i}$, where the set $A \in \mathcal{G}$. Guttman has discussed specific procedures for normal and exponential distributions with $A = (-\infty, a)$ where a is known and specified in advance.

Several authors have considered formulations and goals different from the usual ones. In the remaining part of this section we will briefly describe these modifications.

(a) A generalization of subset selection goal.

Suppose that there exists a binary relation $\prec$ which orders the populations $\pi_1, \ldots, \pi_k$ from worst to best. The ordered populations are denoted by $\pi_{(1)} \prec \pi_{(2)} \prec \cdots \prec \pi_{(k)}$. This gives a unique t-subset comprising the t best populations, namely, $\pi_{(k-t+1)}, \ldots, \pi_{(k)}$, for $t \ge 1$. The experimenter's goal is to select a subcollection of the collection of all subsets of size s from the k populations such that at least one such selected subset contains at least c of the t best populations. A correct selection is a realization of the experimenter's goal. For a given probability P*, a rule $R_s$ is proposed satisfying the condition that $P(CS \mid R_s) \ge P^*$ no matter what the unknown configuration of the populations $\pi_i$.

Of course in a meaningful problem, we have constraints on the values of t, s and c, namely, $1 \le t < k$, $1 \le s < k$, $\max[1, s+t+1-k] \le c \le \min[s, t]$.

Let $X_{ij}$, $j = 1, \ldots, n_i$, be independent random variables denoting observations from population $\pi_i$, $i = 1, \ldots, k$. Let $T_i = T(X_{i1}, X_{i2}, \ldots, X_{in_i})$, $i = 1, \ldots, k$, be independent statistics with absolutely continuous distributions $G_{\theta_i} = G_i$, $i = 1, \ldots, k$, suitably chosen such that $\pi_i \prec \pi_j \Rightarrow T_i \le_{st} T_j$, $1 \le i, j \le k$. Let $t_i$ be an observed value of $T_i$, $i = 1, \ldots, k$. Then the rule $R_s$ proposed and studied by Gupta and Deverman (1969) is the following.

$R_s$: Consider all possible s-subsets (subsets of size s) of $\pi_1, \ldots, \pi_k$. Include in the collection of s-subsets the s-subset $\{\pi_{i_1}, \pi_{i_2}, \ldots, \pi_{i_s}\}$ having the observations $A = \{t_{i_1}, t_{i_2}, \ldots, t_{i_s}\}$ and complementary set of observations $A^c = \{t_{i_{s+1}}, \ldots, t_{i_k}\}$ iff $d[T_{[\,\cdot\,]}(A), T_{[k-s]}(A^c)] \ge -d^*$ (the order-statistic index on A is illegible in the source), where $T_{[i]}(A)$ is the ith smallest element in any finite set A of real numbers, d(x, y) is a generalized difference such that (i) $d(x, y) = 0 \Leftrightarrow x = y$, (ii) for fixed $y = y_0$, $d(x, y_0)$ is increasing in x and (iii) for fixed $x = x_0$, $d(x_0, y)$ is decreasing in y, and the constant $d^* \ge 0$ is chosen so that the P* probability condition is satisfied. For the procedure $R_s$, it has been shown that the infimum of $P(CS \mid R_s)$ occurs when all the populations are identical w.r.t. the binary relation with which they are ordered.

Gupta and Deverman have also discussed the normal means problem in particular.


(b)  Selecting  a  subset  better  than  a  standard 

Under  this  formulation  we  have  (k*l)  populations  (i=0, 1, . . . ,k+l) 
with  the  associated  distribution  functions  F@  .  The  parameters  9j,...,  0U 


are  unknown  and  the  parameter  8^  of  the  standard  population  nay  or  nay  not 
be  known.  The  goal  is  to  select  a  subset  containing  all  the  populations 
for  which  6^  _>  8^  (or  8^  <_  8^) .  Any  rule  R  defined  for  the  purpose  is 
required  to  satisfy  the  P*-condition. 

The cases of location and scale parameters have been discussed by Gupta
(1965). Earlier Gupta and Sobel (1958) have considered the normal means
problem where the procedure based on sample means $\bar{x}_i$ $(i = 0, \ldots, k)$
selects $\pi_i$ iff $\bar{x}_i \ge \bar{x}_0 - A/\sqrt{n}$. (It is assumed that
all populations have unit variance.)
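
The normal means rule just described can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function name
`select_better_than_standard` is hypothetical, the populations are labelled
$1, \ldots, k$, a common sample size `n` and unit variances are assumed, and the
constant in the rule is passed in as `const` rather than derived from the
$P^*$-condition.

```python
import math


def select_better_than_standard(sample_means, standard_mean, n, const):
    """Return labels (1..k) of populations retained against the standard.

    Population i is retained when its sample mean is at least the sample
    mean of the standard minus const / sqrt(n). The value of `const` is an
    input here; choosing it to meet the P*-condition is outside this sketch.
    """
    margin = const / math.sqrt(n)
    return [
        label
        for label, mean in enumerate(sample_means, start=1)
        if mean >= standard_mean - margin
    ]


if __name__ == "__main__":
    means = [0.15, -0.5, 0.4]
    print(select_better_than_standard(means, standard_mean=0.3, n=25, const=1.0))
```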

Puri  and  Puri  (1968,  1969)  have  investigated  rules  based  on  ranks  for 
the  location  and  scale  parameter  cases  and  have  studied  the  efficiency  of 
these  procedures  compared  to  the  normal  theory  procedures.  The  results  and 
techniques  of  these  investigations  are  similar  to  those  of  Lehmann  (1963). 

Nonparametric  selection  procedures  for  selecting  populations  better  than 
a standard when the comparison is in terms of $\alpha$-quantile have been discussed
by  Rizvi,  Sobel  and  Woodworth  (1968).  The  corresponding  subset  selection 
problem  under  the  usual  formulation  has  been  investigated  by  Rizvi  and  Sobel 
(1967)  and  has  been  discussed  in  Chapter  6. 

In comparing a population with a standard, Lehmann (1961) considered a
population to be good if it is sufficiently better than the standard. To be
precise, let $\pi_i$ $(i = 1, \ldots, k)$ be a population whose quality is
characterized by a real-valued parameter $\theta_i$; a population is said to be
positive (or good) if $\theta_i \ge \theta_0 + \Delta$ and negative (or bad) if
$\theta_i \le \theta_0$, where $\Delta$ is a given positive constant and
$\theta_0$ is either a given number or a parameter that may be estimated. A
negative population included in the selected subset is called a false positive,
while a good population not included in the subset is called a false negative.
Roughly speaking, the aim of a selection procedure is to seek out the positive
populations while holding false positives in the selected subset to a minimum.

Let $S(\theta, \delta)$ and $R(\theta, \delta)$ denote the expected number of
true positives and false positives, respectively, using the procedure $\delta$.
Then the problem is to determine a procedure $\delta$ for which
$\sup_{\theta \in \Omega} R(\theta, \delta)$ is minimum subject to the condition
that $\inf_{\theta \in \Omega'} S(\theta, \delta) \ge \gamma$, where $\Omega$
denotes the whole parameter space and $\Omega'$ denotes the set of
parameter-points for which at least one of the populations is positive.
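
To make the terminology concrete, the sketch below labels each population as
positive or negative relative to the standard and counts the true and false
positives in one selected subset. It is an illustration only: the function
names and the label `"neither"` for populations falling between the two
definitions are hypothetical, and the counts are for a single realized subset,
whereas $S$ and $R$ above are expectations of such counts.

```python
def classify(theta_i, theta_0, delta):
    """Label one population relative to the standard value theta_0."""
    if theta_i >= theta_0 + delta:
        return "positive"
    if theta_i <= theta_0:
        return "negative"
    # Populations strictly between theta_0 and theta_0 + delta are neither.
    return "neither"


def count_true_and_false_positives(thetas, theta_0, delta, selected):
    """Count positives and negatives among the selected indices (0-based)."""
    labels = [classify(t, theta_0, delta) for t in thetas]
    true_pos = sum(1 for i in selected if labels[i] == "positive")
    false_pos = sum(1 for i in selected if labels[i] == "negative")
    return true_pos, false_pos


if __name__ == "__main__":
    thetas = [1.5, 0.2, 0.9, -0.3]
    print(count_true_and_false_positives(thetas, 0.5, 0.5, selected=[0, 1, 2]))
```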

Under certain conditions, Lehmann (1961) shows that a rule minimax in
the above sense selects $\pi_i$ whenever a suitable statistic, whose
distribution depends only on $\theta_i$, is at least as large as a suitable
constant $c_i$. He has also discussed the applications of these results to
distributions with monotone likelihood ratio in the case where $\theta_0$ is
known and to normal distributions where observations on $\theta_0$ are included
in the experiment.

Krishnaiah and Rizvi (1966) have considered the problem of selecting
multivariate normal populations better than a control on the basis of the linear
combinations of the elements of the mean vectors of the populations. Different
definitions of positive and negative populations have been used and in each case
a selection procedure $\delta$ is proposed such that
$\inf_{\omega} P(\omega, \delta) \ge P^*$ or $\inf_{\omega} S(\omega, \delta) \ge p^*$,
where $P(\omega, \delta)$ denotes the probability of including all positive
populations, $S(\omega, \delta)$ denotes the expected proportion of true
positives and $P^*$ and $p^*$ are given constants. As an illustration of the
type of results obtained by Krishnaiah and Rizvi, consider the set of
populations $\pi_1, \ldots, \pi_k$ and the control population $\pi_0$, where
$\pi_i$ $(i = 0, 1, \ldots, k)$ is the $p$-variate normal distribution
$N_p(\mu_i, \Sigma_i)$. Let $\theta_{ic} = a_c' \mu_i$
$(c = 1, \ldots, r;\ i = 0, 1, \ldots, k)$, where $a_1, \ldots, a_r$ are specified
vectors. The population $\pi_i$ is said to be positive if
$\theta_{ic} - \theta_{0c} \ge \Delta_c$, $c = 1, \ldots, r$, and negative if
$\theta_{ic} \le \theta_{0c}$, $c = 1, \ldots, r$, where $\Delta_c$ are given
positive constants. For the case of known $\Sigma_i$ $(i = 0, 1, \ldots, k)$,
the rule $\delta$ proposed selects $\pi_i$ iff

$$a_c'(\bar{x}_i - \bar{x}_0) \Big/ \left[a_c'\left(n_i^{-1}\Sigma_i + n_0^{-1}\Sigma_0\right)a_c\right]^{1/2} \ge d, \qquad c = 1, \ldots, r, \tag{10.1}$$

where $\bar{x}_i$ is the sample mean vector from $\pi_i$ based on $n_i$ observations.
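
A small Python sketch may help in reading rule (10.1). It assumes that the
statistic for each specified vector $a_c$ is the difference
$a_c'(\bar{x}_i - \bar{x}_0)$ divided by its standard deviation
$[a_c'(\Sigma_i/n_i + \Sigma_0/n_0)a_c]^{1/2}$, and that $\pi_i$ is selected
only when every such statistic is at least $d$. The function names are
hypothetical, plain lists are used in place of a matrix library, and the
constant `d` is supplied by the caller rather than derived.

```python
import math


def quadratic_form(a, matrix):
    """Compute a' M a for a vector a and a square matrix M (nested lists)."""
    size = len(a)
    return sum(a[r] * matrix[r][c] * a[c] for r in range(size) for c in range(size))


def selects_population(xbar_i, xbar_0, sigma_i, sigma_0, n_i, n_0, vectors, d):
    """Return True when every standardized contrast is at least d."""
    diff = [xi - x0 for xi, x0 in zip(xbar_i, xbar_0)]
    for a in vectors:
        numerator = sum(aj * dj for aj, dj in zip(a, diff))
        # Variance of a'(xbar_i - xbar_0) when both covariance matrices are known.
        variance = quadratic_form(a, sigma_i) / n_i + quadratic_form(a, sigma_0) / n_0
        if numerator / math.sqrt(variance) < d:
            return False
    return True


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    vectors = [[1.0, 0.0], [0.0, 1.0]]
    print(selects_population([1.0, 2.0], [0.0, 0.0], identity, identity, 8, 8, vectors, 1.5))
```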

Krishnaiah  (1967)  investigated  similar  procedures  when  the  comparison  of 

the  multivariate  normal  populations  with  the  control  population  is  based  on 
linear  combinations  of  elements  of  the  covariance  matrices,  determinants  of  the 
covariance  matrices  and  the  largest  (smallest)  characteristic  roots. 

Desu (1970) considered the selection problem where the populations are
not compared with a standard but rather with the best among them. If
$d(\theta_i, \theta_j)$ is a distance measure between $\theta_i$ and $\theta_j$
and if $\theta_{\max} = \max(\theta_1, \ldots, \theta_k)$, population $\pi_i$ is
said to be superior (or good) if $d(\theta_{\max}, \theta_i) < \delta_1^*$ and
inferior (or bad) if $d(\theta_{\max}, \theta_i) > \delta_2^*$, where
$\delta_1^*$, $\delta_2^*$ are specified constants such that
$0 < \delta_1^* < \delta_2^*$. For the location and scale parameter cases which
have been considered,

$d(\theta_i, \theta_j)$ is taken to be $\theta_i - \theta_j$ and
$\theta_i/\theta_j$ respectively. The proposed procedure $R$ selects $\pi_i$ iff
$d(Y_{\max}, Y_i) \le d(\delta_2^*, c)$, where $Y_i$ is a real-valued statistic
based on a random sample of size $n$ from $\pi_i$ whose distribution has
$\theta_i$ as a scale (or location) parameter and the constant $c$ is to be
chosen such that the $P^*$-condition is satisfied. The correct selection here is
the selection of a subset which contains no inferior population.
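
The definitions of superior and inferior populations, and of a correct
selection in Desu's sense, can be illustrated as follows. The sketch assumes
the location case, with distance $d(x, y) = x - y$ as the default, and takes the
true parameter values as known purely for the sake of illustration. The
function names and the label `"neither"` are hypothetical.

```python
def classify_against_best(thetas, delta_1, delta_2, d=lambda x, y: x - y):
    """Label each population by its distance from the best parameter value."""
    theta_max = max(thetas)
    labels = []
    for theta in thetas:
        distance = d(theta_max, theta)
        if distance < delta_1:
            labels.append("superior")
        elif distance > delta_2:
            labels.append("inferior")
        else:
            labels.append("neither")
    return labels


def is_correct_selection(labels, selected):
    """A selection is correct when it contains no inferior population."""
    return all(labels[i] != "inferior" for i in selected)


if __name__ == "__main__":
    labels = classify_against_best([1.0, 2.5, 3.0], delta_1=0.6, delta_2=1.5)
    print(labels)
    print(is_correct_selection(labels, [1, 2]), is_correct_selection(labels, [0, 2]))
```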


Mahamunulu (1967) considered a selection problem under the indifference-zone
approach with the modified goal of selecting a subset of size $s$ which
contains at least $c$ of the $t$ best populations, where
$\max(1, s+t+1-k) \le c \le \min(s, t)$.
Closely related to Mahamunulu's problem of determining the common sample size
required for a given subset size $s$ is the problem investigated by Desu and
Sobel (1968). Their goal is to select the smallest possible fixed subset
size $s$ that will contain the $t$ best of $k$ populations $(t \le s \le k)$,
based on any given sample size from each population. The basic probability
requirement is met under the usual indifference-zone set-up. The aim in the
modification is to avoid the possible inclusion of all the populations in the
selected subset. The smallest fixed subset size $s$ is determined as a function
of the common sample size $n$ and the specified constants but not of the
observations.
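
The bounds on $c$ in Mahamunulu's formulation have a simple counting
explanation: any subset of size $s$ drawn from $k$ populations already contains
at least $s + t - k$ of the $t$ best, so only values of $c$ above that number
give a non-trivial requirement, and $c$ cannot exceed either $s$ or $t$. The
short Python sketch below lists the admissible values; the function name
`admissible_c_values` is hypothetical.

```python
def admissible_c_values(k, s, t):
    """List the values of c with max(1, s+t+1-k) <= c <= min(s, t)."""
    if not (1 <= t <= k and 1 <= s <= k):
        raise ValueError("need 1 <= t <= k and 1 <= s <= k")
    lower = max(1, s + t + 1 - k)
    upper = min(s, t)
    # The range is empty when lower > upper, e.g. when s = k.
    return list(range(lower, upper + 1))


if __name__ == "__main__":
    print(admissible_c_values(k=10, s=4, t=3))
    print(admissible_c_values(k=5, s=4, t=3))
```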

Nonparametric procedures for selecting fixed-size subsets when the
populations are ranked in terms of $\alpha$-quantiles have been discussed by Desu
and Sobel (1971). The random subset size procedure for the case of $t = 1$ has
been earlier studied by Rizvi and Sobel (1967) and has been described in Chapter 6.

Sobel (1969) investigated the problem of selecting from $k$ populations a
subset containing at least one of the $t$ best populations for given $t$ and
$k$ $(1 \le t < k)$ under an indifference-zone set-up. For $t = 1$, the problem
is related to the problem of Desu and Sobel (1968). The procedures proposed by
Sobel select a subset which is either of fixed size or of random size depending
on the values of the constants specified.